Abstract

The contributions of the first half of this thesis are on the computational and 
algebraic aspects of convexity in polynomial optimization. We show that unless 
P=NP, there exists no polynomial time (or even pseudo-polynomial time) algo- 
rithm that can decide whether a multivariate polynomial of degree four (or higher 
even degree) is globally convex. This solves a problem that has been open since 
1992 when N. Z. Shor asked for the complexity of deciding convexity for quartic 
polynomials. We also prove that deciding strict convexity, strong convexity, qua- 
siconvexity, and pseudoconvexity of polynomials of even degree four or higher is 
strongly NP-hard. By contrast, we show that quasiconvexity and pseudoconvexity 
of odd degree polynomials can be decided in polynomial time. 

We then turn our attention to sos-convexity — an algebraic sum of squares 
(sos) based sufficient condition for polynomial convexity that can be efficiently 
checked with semidefinite programming. We show that three natural formulations 
for sos-convexity derived from relaxations on the definition of convexity, its first 
order characterization, and its second order characterization are equivalent. We 
present the first example of a convex polynomial that is not sos-convex. Our main 
result then is to prove that the cones of convex and sos-convex polynomials (resp. 
forms) in n variables and of degree d coincide if and only if n = 1 or d = 2 or 
(n, d) = (2,4) (resp. n = 2 or d = 2 or (n, d) = (3,4)). Although for disparate 
reasons, the remarkable outcome is that convex polynomials (resp. forms) are sos- 
convex exactly in cases where nonnegative polynomials (resp. forms) are sums of 
squares, as characterized by Hilbert in 1888. 

The contributions of the second half of this thesis are on the development 
and analysis of computational techniques for certifying stability of uncertain and
nonlinear dynamical systems. We show that deciding asymptotic stability of ho- 
mogeneous cubic polynomial vector fields is strongly NP-hard. We settle some of 
the converse questions on existence of polynomial and sum of squares Lyapunov 
functions. We present a globally asymptotically stable polynomial vector field 
with no polynomial Lyapunov function. We show via an explicit counterexample 
that if the degree of the polynomial Lyapunov function is fixed, then sos pro- 
gramming can fail to find a valid Lyapunov function even though one exists. By 
contrast, we show that if the degree is allowed to increase, then existence of a 
polynomial Lyapunov function for a planar or a homogeneous polynomial vec- 
tor field implies existence of a polynomial Lyapunov function that can be found 
with sos programming. We extend this result to develop a converse sos Lyapunov 
theorem for robust stability of switched linear systems. 

In our final chapter, we introduce the framework of path-complete graph Lya- 
punov functions for approximation of the joint spectral radius. The approach is 
based on the analysis of the underlying switched system via inequalities imposed 
between multiple Lyapunov functions associated to a labeled directed graph. In- 
spired by concepts in automata theory and symbolic dynamics, we define a class 
of graphs called path-complete graphs, and show that any such graph gives rise 
to a method for proving stability of switched systems. The semidefinite programs 
arising from this technique include as special cases many of the existing methods
such as common quadratic, common sum of squares, and maximum/minimum-of- 
quadratics Lyapunov functions. We prove approximation guarantees for analysis 
via several families of path-complete graphs and a constructive converse Lyapunov 
theorem for maximum/minimum-of-quadratics Lyapunov functions. 

Thesis Supervisor: Pablo A. Parrilo 

Title: Professor of Electrical Engineering and Computer Science 
Introduction



With the advent of modern computers in the last century and the rapid increase 
in our computing power ever since, more and more areas of science and engineer- 
ing are being viewed from a computational and algorithmic perspective — the field 
of optimization and control is no exception. Indeed, what we often regard nowa- 
days as a satisfactory solution to a problem in this field — may it be the optimal 
allocation of resources in a power network or the planning of paths of minimum 
fuel consumption for a group of satellites — is an efficient algorithm that when fed 
with an instance of the problem as input, returns in a reasonable amount of time 
an output that is guaranteed to be optimal or near optimal. 

Fundamental concepts from theory of computation, such as the notions of a 
Turing machine, decidability, polynomial time solvability, and the theory of NP- 
completeness, have allowed us to make precise what it means to have an (efficient) 
algorithm for a problem and much more remarkably to even be able to prove 
that for certain problems such algorithms do not exist. The idea of establishing 
"hardness results" to provide rigorous explanations for why progress on some 
problems tends to be relatively unsuccessful is commonly used today across many 
disciplines and rightly so. Indeed, when a problem is resisting all attempts for an 
(efficient) algorithm, little is more valuable to an unsatisfied algorithm designer 
than the ability to back up the statement "I cannot do it" with the claim that 
"it cannot be done". 

Over the years, the line between what can or cannot be efficiently computed 
has proven to be a thin one. There are many examples in optimization and control
where complexity results reveal that two problems that on the surface appear 
quite similar have very different structural properties. Consider for example the 
problem of deciding, given a symmetric matrix Q, whether $x^T Q x$ is nonnegative
for all $x \in \mathbb{R}^n$, and contrast this to the closely related problem of deciding whether
$x^T Q x$ is nonnegative for all x's in $\mathbb{R}^n$ that are elementwise nonnegative. The first
problem, which is at the core of semidefinite programming, can be answered in
polynomial time (in fact in $O(n^3)$), whereas the second problem, which forms
the basis of copositive programming, is NP-hard and can easily encode many
hard combinatorial problems [109]. Similar scenarios arise in control theory. An 
interesting example is the contrast between the problems of deciding stability of 
interval polynomials and interval matrices. If we are given a single univariate 
polynomial of degree n or a single n x n matrix, then standard classical results 
enable us to decide in polynomial time whether the polynomial or the matrix 
is (strictly) stable, i.e., has all of its roots (resp. eigenvalues) in the open left
half complex plane. Suppose now that we are given lower and upper bounds 
on the coefficients of the polynomial or on the entries of the matrix and we are 
asked to decide whether all polynomials or matrices in this interval family are 
stable. Can the answer still be given in polynomial time? For the case of interval 
polynomials, Kharitonov famously demonstrated [87] that it can: stability of 
an interval polynomial can be decided by checking whether four polynomials 
obtained from the family via some simple rules are stable. One may naturally 
speculate whether such a wonderful result can also be established for interval 
matrices, but alas, NP-hardness results [110] reveal that unless P=NP, this cannot 
happen. 
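As a hedged illustration of the four-polynomial test, the sketch below builds the four corner polynomials in the usual textbook form of Kharitonov's theorem and tests each with the Routh-Hurwitz criterion in exact rational arithmetic. The function names and the convention that coefficients are listed from the constant term upward are our own choices for this example, and the sketch assumes that the interval of the leading coefficient excludes zero.

```python
from fractions import Fraction


def determinant(matrix):
    """Exact determinant via Gaussian elimination over the rationals."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot_row = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot_row is None:
            return Fraction(0)
        if pivot_row != col:
            a[col], a[pivot_row] = a[pivot_row], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def is_hurwitz(coeffs):
    """coeffs[i] multiplies s**i; True iff every root has negative real part."""
    n = len(coeffs) - 1
    if n < 1 or coeffs[n] == 0:
        raise ValueError("need a polynomial of degree at least one")
    sign = 1 if coeffs[n] > 0 else -1
    a = [sign * c for c in reversed(coeffs)]  # a[0] is the positive leading coefficient
    # Hurwitz matrix: entry (i, j) is a[2j - i + 1], taken as zero outside 0..n.
    hurwitz = [[a[2 * j - i + 1] if 0 <= 2 * j - i + 1 <= n else 0
                for j in range(n)] for i in range(n)]
    # Routh-Hurwitz criterion: all leading principal minors must be positive.
    return all(determinant([row[:k] for row in hurwitz[:k]]) > 0
               for k in range(1, n + 1))


def kharitonov_polynomials(lower, upper):
    """Four corner polynomials; lower[i] <= coefficient of s**i <= upper[i]."""
    patterns = ("lluu", "uull", "luul", "ullu")
    return [[lower[i] if pattern[i % 4] == "l" else upper[i]
             for i in range(len(lower))] for pattern in patterns]


def interval_family_is_stable(lower, upper):
    """Stability of the whole interval family from four stability tests."""
    if lower[-1] <= 0 <= upper[-1]:
        raise ValueError("leading coefficient interval must exclude zero")
    return all(is_hurwitz(k) for k in kharitonov_polynomials(lower, upper))


# s^3 + [3,4] s^2 + [3,4] s + [1,2]: every member is stable.
stable_family = interval_family_is_stable([1, 3, 3, 1], [2, 4, 4, 1])
# s^3 + [1,2] s^2 + [1,2] s + [3,4]: some member is unstable.
unstable_family = interval_family_is_stable([3, 1, 1, 1], [4, 2, 2, 1])
```

The total work is four determinant-based tests regardless of how many coefficients are uncertain, which is what makes the interval polynomial problem tractable.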

Aside from ending the quest for exact efficient algorithms, an NP-hardness 
result also serves as an insightful bridge between different areas of mathematics. 
Indeed, when we give a reduction from an NP-hard problem to a new problem of 
possibly different nature, it becomes apparent that the computational difficulties 
associated with the first problem are intrinsic also to the new problem. Con- 
versely, any algorithm that has been previously developed for the new problem 
can now readily be applied also to the first problem. This concept is usually par- 
ticularly interesting when one problem is in the domain of discrete mathematics 
and the other in the continuous domain, as will be the case for problems con- 
sidered in this thesis. For example, we will give a reduction from the canonical 
NP-complete problem of 3SAT to the problem of deciding stability of a certain 
class of differential equations. As a byproduct of the reduction, it will follow that 
a certificate of unsatisfiability of instances of 3SAT can always be given in the
form of a Lyapunov function.

In general, hardness results in optimization come with a clear practical im- 
plication: as an algorithm designer, we either have to give up optimality and be 
content with finding suboptimal solutions, or we have to work with a subclass 
of problems that have more tractable attributes. In view of this, it becomes ex- 
ceedingly relevant to identify structural properties of optimization problems that 
allow for tractability of finding optimal solutions. 

One such structural property, which by and large is the most fundamental one 
that we know of, is convexity. As a geometric property, convexity comes with 
many attractive consequences. For instance, every local minimum of a convex 
problem is also a global minimum. Or for example, if a point does not belong to
a convex set, this nonmembership can be certified through a separating hyper- 
plane. Due in part to such special attributes, convex problems generally allow for 
efficient algorithms for solving them. Among other approaches, a powerful theory 
of interior-point polynomial time methods for convex optimization was developed 
in [111]. At least when the underlying convex cone has an efficiently computable 
so-called "barrier function", these algorithms are efficient both in theory and in 
practice. 

Extensive and greatly successful research in the applications of convex opti- 
mization over the last couple of decades has shown that surprisingly many prob- 
lems of practical importance can be cast as convex optimization problems. More- 
over, we have a fair number of rules based on the calculus of convex functions 
that allow us to design — whenever we have the freedom to do so — problems that 
are by construction convex. Nevertheless, in order to be able to exploit the po- 
tential of convexity in optimization in full, a very basic question is to understand 
whether we are even able to recognize the presence of convexity in optimization 
problems. In other words, can we have an efficient algorithm that tests whether 
a given optimization problem is convex? 

We will show in this thesis — answering a longstanding question of N.Z. Shor— 
that unfortunately even for the simplest classes of optimization problems where 
the objective function and the defining functions of the feasible set are given by 
polynomials of modest degree, the question of determining convexity is NP-hard. 
We also show that the same intractability result holds for essentially any well- 
known variant of convexity (generalized convexity). These results suggest that as 
significant as convexity may be in optimization, we may not be able to in general 
guarantee its presence before we can enjoy its consequences. 

Of course, NP-hardness of a problem does not stop us from studying it, but 
on the contrary stresses the need for finding good approximation algorithms that 
can deal with a large number of instances efficiently. Towards this goal, we will 
devote part of this thesis to a study of convexity from an algebraic viewpoint. 
We will argue that in many cases, a notion known as sos-convexity, which is an
efficiently checkable algebraic counterpart of convexity, can be a viable substi- 
tute for convexity of polynomials. Aside from its computational implications, 
sos-convexity has recently received much attention in the area of convex algebraic 
geometry [26], [55], [75], [89], [90], [91], mainly due to its role in connecting the geo- 
metric and algebraic aspects of convexity. In particular, the name "sos-convexity" 
comes from the work of Helton and Nie on semidefinite representability of convex 
sets [75]. 

The basic idea behind sos-convexity is nothing more than a simple extension of 
the concept of representation of nonnegative polynomials as sums of squares. To 
demonstrate this idea on a concrete example, suppose we are given the polynomial

$$p(x) = x_1^4 - 6x_1^3 x_2 + 2x_1^3 x_3 + 6x_1^2 x_3^2 + 9x_1^2 x_2^2 - 6x_1^2 x_2 x_3 - 14x_1 x_2 x_3^2 + 4x_1 x_3^3 + 5x_3^4 - 7x_2^2 x_3^2 + 16x_2^4, \qquad (1.1)$$

and we are asked to decide whether it is nonnegative, i.e., whether $p(x) \geq 0$ for all
$x := (x_1, x_2, x_3)$ in $\mathbb{R}^3$. This may seem like a daunting task (and indeed it is as
deciding nonnegativity of quartic polynomials is also NP-hard), but suppose that
we could "somehow" come up with a decomposition of the polynomial as a sum of
squares (sos):

$$p(x) = (x_1^2 - 3x_1 x_2 + x_1 x_3 + 2x_3^2)^2 + (x_1 x_3 - x_2 x_3)^2 + (4x_2^2 - x_3^2)^2. \qquad (1.2)$$

Then, we have at our hands an explicit certificate of nonnegativity of p(x), which
can be easily checked (simply by multiplying the terms out).
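To make "multiplying the terms out" concrete, here is a minimal sketch that expands the three squares with exact integer arithmetic and compares the result with the polynomial term by term. The dictionary encoding (an exponent triple mapped to a coefficient) and the names `p`, `q1`, `q2`, `q3` are our own transcription of the polynomial and of the three squared terms above. They are assumptions of this example, not part of the original text.

```python
def add(*polys):
    """Sum polynomials stored as {(e1, e2, e3): coefficient}."""
    total = {}
    for poly in polys:
        for mono, coeff in poly.items():
            total[mono] = total.get(mono, 0) + coeff
    return {m: c for m, c in total.items() if c != 0}


def mul(f, g):
    """Multiply two polynomials by adding exponents of every pair of terms."""
    out = {}
    for m1, c1 in f.items():
        for m2, c2 in g.items():
            mono = tuple(e1 + e2 for e1, e2 in zip(m1, m2))
            out[mono] = out.get(mono, 0) + c1 * c2
    return {m: c for m, c in out.items() if c != 0}


# The quartic polynomial: key (a, b, c) stands for x1**a * x2**b * x3**c.
p = {
    (4, 0, 0): 1, (3, 1, 0): -6, (3, 0, 1): 2, (2, 0, 2): 6,
    (2, 2, 0): 9, (2, 1, 1): -6, (1, 1, 2): -14, (1, 0, 3): 4,
    (0, 0, 4): 5, (0, 2, 2): -7, (0, 4, 0): 16,
}

# The three polynomials whose squares form the claimed decomposition.
q1 = {(2, 0, 0): 1, (1, 1, 0): -3, (1, 0, 1): 1, (0, 0, 2): 2}
q2 = {(1, 0, 1): 1, (0, 1, 1): -1}
q3 = {(0, 2, 0): 4, (0, 0, 2): -1}

sum_of_squares = add(mul(q1, q1), mul(q2, q2), mul(q3, q3))
certificate_is_valid = (sum_of_squares == p)
```

Checking the certificate costs only a handful of polynomial multiplications. The hard part is finding the squares in the first place.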

It turns out (see e.g. [118], [119]) that because of several interesting connections 
between real algebra and convex optimization discovered in recent years and quite 
well-known by now, the question of existence of an sos decomposition can be cast 
as a semidefinite program, which can be solved efficiently e.g. by interior point 
methods. As we will see more formally later, the notion of sos-convexity is based 
on an appropriately defined sum of squares decomposition of the Hessian matrix 
of a polynomial and hence it can also be checked efficiently with semidefinite 
programming. Just like sum of squares decomposition is a sufficient condition for 
polynomial nonnegativity, sos-convexity is a sufficient condition for polynomial 
convexity. 
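For orientation, the standard way this reduction is usually written is sketched below. The symbols $z(x)$ and $Q$ are notation introduced for this sketch: $z(x)$ stacks the monomials of degree at most $d$, and $Q$ is often called a Gram matrix.

```latex
% z(x): vector of all monomials of degree at most d in x = (x_1, ..., x_n).
p \ \text{(of degree } 2d\text{) is sos}
\quad \Longleftrightarrow \quad
\exists\, Q = Q^T \succeq 0 \ \ \text{such that} \ \ p(x) = z(x)^T Q \, z(x) \ \ \text{for all } x .
```

Matching the coefficients of $p(x)$ with those of $z(x)^T Q z(x)$ gives linear equations in the entries of $Q$, so the search for $Q$ is a semidefinite feasibility problem.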

An important question that remains here is the obvious one: when do nonneg- 
ative polynomials admit a decomposition as a sum of squares? The answer to this 
question comes from a classical result of Hilbert. In his seminal 1888 paper [77], 
Hilbert gave a complete characterization of the degrees and dimensions in which 
all nonnegative polynomials can be written as sums of squares. In particular, he 
proved that there exist nonnegative polynomials with no sum of squares decom- 
position, although explicit examples of such polynomials appeared only 80 years 
later. One of the main contributions of this thesis is to establish the counterpart 
of Hilbert's results for the notions of convexity and sos-convexity. In particular,
we will give the first example of a convex polynomial that is not sos-convex, and 
by the end of the first half of this thesis, a complete characterization of the de- 
grees and dimensions in which convexity and sos-convexity are equivalent. Some 
interesting and unexpected connections to Hilbert's results will also emerge in the 
process. 

In the second half of this thesis, we will turn to the study of stability in 
dynamical systems. Here too, we will take a computational viewpoint with our 
goal being the development and analysis of efficient algorithms for proving
stability of certain classes of nonlinear and hybrid systems. 

Almost universally, the study of stability in systems theory leads to Lyapunov's 
second method or one of its many variants. An outgrowth of Lyapunov's 1892 
doctoral dissertation [99], Lyapunov's second method tells us, roughly speaking, 
that if we succeed in finding a Lyapunov function — an energy-like function of the 
state that decreases along trajectories — then we have proven that the dynamical 
system in question is stable. In the mid 1900s, a series of converse Lyapunov 
theorems were developed which established that any stable system indeed has a 
Lyapunov function (see [72, Chap. 6] for an overview). Although this is encour- 
aging, except for the simplest classes of systems such as linear systems, converse 
Lyapunov theorems do not provide much practical insight into how one may go 
about finding a Lyapunov function. 
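For the linear case mentioned above, a quadratic function $V(x) = x^T P x$ decreases along trajectories of $\dot{x} = Ax$ exactly when $A^T P + P A$ is negative definite. The sketch below checks this for one hand-picked pair $(A, P)$. The matrices and helper names are illustrative assumptions, and the definiteness test is restricted to 2 x 2 matrices for brevity.

```python
from fractions import Fraction


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def lyapunov_derivative_matrix(A, P):
    """Return A^T P + P A, so that d/dt (x^T P x) = x^T (A^T P + P A) x."""
    left = matmul(transpose(A), P)
    right = matmul(P, A)
    n = len(P)
    return [[left[i][j] + right[i][j] for j in range(n)] for i in range(n)]


def is_positive_definite_2x2(M):
    """Sylvester's criterion for a symmetric 2 x 2 matrix."""
    return M[0][0] > 0 and M[0][0] * M[1][1] - M[0][1] * M[1][0] > 0


# A damped oscillator, written as x' = A x (hypothetical example system).
A = [[0, 1], [-1, -1]]
# Candidate Lyapunov function V(x) = x^T P x.
P = [[Fraction(3, 2), Fraction(1, 2)], [Fraction(1, 2), Fraction(1)]]

D = lyapunov_derivative_matrix(A, P)
negated_D = [[-v for v in row] for row in D]
certifies_stability = (is_positive_definite_2x2(P)
                       and is_positive_definite_2x2(negated_D))
```

Here $A^T P + P A$ equals minus the identity, so $V$ is a valid Lyapunov function for this system.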

In the last few decades however, advances in the theory and practice of con- 
vex optimization and in particular semidefinite programming (SDP) have reju- 
venated Lyapunov theory. The approach has been to parameterize a class of 
Lyapunov functions with restricted complexity (e.g., quadratics, pointwise maxi- 
mum of quadratics, polynomials, etc.) and then pose the search for a Lyapunov 
function as a convex feasibility problem. A widely popular example of this frame- 
work which we will revisit later in this thesis is the method of sum of squares 
Lyapunov functions [118], [121]. Expanding on the concept of sum of squares 
decomposition of polynomials described above, this technique allows one to for- 
mulate semidefinite programs that search for polynomial Lyapunov functions for 
polynomial dynamical systems. Sum of squares Lyapunov functions, along with 
many other SDP based techniques, have also been applied to systems that un- 
dergo switching; see e.g. [136], [131], [122]. The analysis of these types of systems 
will also be a subject of interest in this thesis. 
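In its simplest form, the sum of squares relaxation replaces the two nonnegativity requirements of Lyapunov's theorem by sos requirements. The following is a sketch under the assumption that the system is $\dot{x} = f(x)$ with $f$ polynomial and $f(0) = 0$. The strict positivity needed for asymptotic stability is typically enforced separately and is not shown here.

```latex
% Search over the coefficients of a polynomial V of fixed degree such that
V(x) \ \text{is sos},
\qquad
-\dot{V}(x) = -\langle \nabla V(x), f(x) \rangle \ \text{is sos}.
```

Both conditions are linear in the unknown coefficients of $V$, which is why the search can be posed as a semidefinite program.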

An algorithmic approach to Lyapunov theory naturally calls for new converse 
theorems. Indeed, classical converse Lyapunov theorems only guarantee existence 
of Lyapunov functions within very broad classes of functions (e.g. the class of 
continuously differentiable functions) that are a priori not amenable to compu- 
tation. So there is the need to know whether Lyapunov functions belonging to 
certain more restricted classes of functions that can be computationally searched 
over also exist. For example, do stable polynomial systems admit Lyapunov func- 
tions that are polynomial? What about polynomial functions that can be found 
with sum of squares techniques? Similar questions arise in the case of switched 
systems. For example, do stable linear switched systems admit sum of squares 
Lyapunov functions? How about Lyapunov functions that are the pointwise max- 
imum of quadratics? If so, how many quadratic functions are needed? We will 
answer several questions of this type in this thesis. 
This thesis will also introduce a new class of techniques for Lyapunov analysis
of switched systems. The novel component here is a general framework for formu- 
lating Lyapunov inequalities between multiple Lyapunov functions that together 
guarantee stability of a switched system under arbitrary switching. The relation 
between these inequalities has interesting links to concepts from automata theory. 
Furthermore, the technique is amenable to semidefinite programming. 

Although the main ideas behind our approach directly apply to broader classes 
of switched systems, our results will be presented in the more specific context of 
switched linear systems. This is mainly due to our interest in the notion of the 
joint spectral radius of a set of matrices which has intimate connections to stabil- 
ity of switched linear systems. The joint spectral radius is an extensively studied 
quantity that characterizes the maximum growth rate obtained by taking arbi- 
trary products from a set of matrices. Computation of the joint spectral radius, 
although notoriously hard [35], [161], has a wide range of applications including 
continuity of wavelet functions, computation of capacity of codes, convergence 
of consensus algorithms, and combinatorics, just to name a few. Our techniques 
provide several hierarchies of polynomial time algorithms that approximate the 
JSR with guaranteed accuracy. 
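To make the notion of maximum growth rate concrete, the sketch below computes the elementary brute-force bounds on the JSR obtained from all products of a fixed length k: the largest spectral radius of a product, raised to the power 1/k, is a lower bound, and the largest induced norm of a product, raised to the power 1/k, is an upper bound. This is not the path-complete technique of this thesis. It enumerates all m^k products and is therefore only usable for tiny instances. The helper names and the restriction to 2 x 2 matrices are assumptions of this example.

```python
import cmath
from itertools import product


def matmul2(a, b):
    """Product of two 2 x 2 matrices."""
    return [[a[0][0] * b[0][0] + a[0][1] * b[1][0],
             a[0][0] * b[0][1] + a[0][1] * b[1][1]],
            [a[1][0] * b[0][0] + a[1][1] * b[1][0],
             a[1][0] * b[0][1] + a[1][1] * b[1][1]]]


def spectral_radius_2x2(m):
    """Largest eigenvalue modulus, from the trace and the determinant."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(trace * trace - 4 * det)
    return max(abs((trace + root) / 2), abs((trace - root) / 2))


def infinity_norm(m):
    """Maximum absolute row sum, an induced (submultiplicative) norm."""
    return max(sum(abs(v) for v in row) for row in m)


def jsr_bounds(matrices, k):
    """Lower and upper bounds on the JSR from all products of length k."""
    lower, upper = 0.0, 0.0
    for word in product(matrices, repeat=k):
        running = [[1, 0], [0, 1]]
        for m in word:
            running = matmul2(m, running)
        lower = max(lower, spectral_radius_2x2(running) ** (1.0 / k))
        upper = max(upper, infinity_norm(running) ** (1.0 / k))
    return lower, upper


A1 = [[1, 1], [0, 1]]
A2 = [[1, 0], [1, 1]]
lower, upper = jsr_bounds([A1, A2], 2)
```

For this pair the length-2 lower bound already equals the golden ratio, while the norm-based upper bound is the square root of 3.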

A more concrete account of the contributions of this thesis will be given in the 
following section. We remark that although the first half of the thesis is mostly 
concerned with convexity in polynomial optimization and the second half with 
Lyapunov analysis, a common theme throughout the thesis is the use of algorithms 
that involve algebraic methods in optimization and semidefinite programming. 

■ 1.1 Outline and contributions of the thesis

The remainder of this thesis is divided into two parts each containing two chap- 
ters. The first part includes our complexity results on deciding convexity in 
polynomial optimization (Chapter 2) and our study of the relationship between 
convexity and sos-convexity (Chapter 3). The second part includes new results on 
Lyapunov analysis of polynomial differential equations (Chapter 4) and a novel 
framework for proving stability of switched systems (Chapter 5). A summary of 
our contributions in each chapter is as follows. 

Chapter 2. The main result of this chapter is to prove that unless P=NP, there 
cannot be a polynomial time algorithm (or even a pseudo-polynomial time al- 
gorithm) that can decide whether a quartic polynomial is globally convex. This 
answers a question of N.Z. Shor that appeared as one of seven open problems in 
complexity theory for numerical optimization in 1992 [117]. We also show that 
deciding strict convexity, strong convexity, quasiconvexity, and pseudoconvexity 
of polynomials of even degree four or higher is strongly NP-hard. By contrast, we
show that quasiconvexity and pseudoconvexity of odd degree polynomials can be 
decided in polynomial time. 

Chapter 3. Our first contribution in this chapter is to prove that three natu- 
ral sum of squares (sos) based sufficient conditions for convexity of polynomials 
via the definition of convexity, its first order characterization, and its second or- 
der characterization are equivalent. These three equivalent algebraic conditions, 
which we will refer to as sos-convexity, can be checked by solving a single semidef- 
inite program. We present the first known example of a convex polynomial that 
is not sos-convex. We explain how this polynomial was found with tools from sos 
programming and duality theory of semidefinite optimization. As a byproduct 
of this numerical procedure, we obtain a simple method for searching over a re- 
stricted family of nonnegative polynomials that are not sums of squares that can 
be of independent interest. 
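For reference, one common way of writing the three sos conditions is sketched below. Each is obtained by replacing "nonnegative" with "sos" in a standard characterization of convexity of a polynomial $p$. The labels $g_{1/2}$, $g_{\nabla}$, and $g_{\nabla^2}$ are used here only for illustration, and the precise statements are those given in Chapter 3.

```latex
g_{1/2}(x, y) := \tfrac{1}{2}\, p(x) + \tfrac{1}{2}\, p(y) - p\!\left(\tfrac{1}{2} x + \tfrac{1}{2} y\right) \ \text{is sos},
\\
g_{\nabla}(x, y) := p(y) - p(x) - \nabla p(x)^T (y - x) \ \text{is sos},
\\
g_{\nabla^2}(x, y) := y^T \nabla^2 p(x)\, y \ \text{is sos}.
```

All three are polynomials in the $2n$ variables $(x, y)$, so each condition can be tested with a semidefinite program.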

If we denote the set of convex and sos-convex polynomials in n variables of
degree d with $\tilde{C}_{n,d}$ and $\tilde{\Sigma C}_{n,d}$ respectively, then our main contribution in this
chapter is to prove that $\tilde{C}_{n,d} = \tilde{\Sigma C}_{n,d}$ if and only if n = 1 or d = 2 or (n, d) = (2, 4).
We also present a complete characterization for forms (homogeneous polynomials)
except for the case (n, d) = (3, 4) which will appear elsewhere [2]. Our result states
that the set $C_{n,d}$ of convex forms in n variables of degree d equals the set $\Sigma C_{n,d}$
of sos-convex forms if and only if n = 2 or d = 2 or (n, d) = (3, 4). To prove these
results, we present in particular explicit examples of polynomials in $\tilde{C}_{2,6} \setminus \tilde{\Sigma C}_{2,6}$
and $\tilde{C}_{3,4} \setminus \tilde{\Sigma C}_{3,4}$ and forms in $C_{3,6} \setminus \Sigma C_{3,6}$ and $C_{4,4} \setminus \Sigma C_{4,4}$, and a general procedure
for constructing forms in $C_{n,d+2} \setminus \Sigma C_{n,d+2}$ from nonnegative but not sos forms in
n variables and degree d.

Although for disparate reasons, the remarkable outcome is that convex polyno- 
mials (resp. forms) are sos-convex exactly in cases where nonnegative polynomials 
(resp. forms) are sums of squares, as characterized by Hilbert. 

Chapter 4. This chapter is devoted to converse results on (non)-existence of poly- 
nomial and sum of squares polynomial Lyapunov functions for systems described 
by polynomial differential equations. We present a simple, explicit example of a 
two-dimensional polynomial vector field of degree two that is globally asymptot- 
ically stable but does not admit a polynomial Lyapunov function of any degree. 
We then study whether existence of a polynomial Lyapunov function implies ex- 
istence of one that can be found with sum of squares techniques. We show via an 
explicit counterexample that if the degree of the polynomial Lyapunov function 
is fixed, then sos programming can fail to find a valid Lyapunov function even 
though one exists. On the other hand, if the degree is allowed to increase, we 
prove that existence of a polynomial Lyapunov function for a planar vector field 
(under an additional mild assumption) or for a homogeneous vector field implies
existence of a polynomial Lyapunov function that is sos and that the negative of 
its derivative is also sos. This result is extended to prove that asymptotic stability 
of switched linear systems can always be proven with sum of squares Lyapunov 
functions. Finally, we show that for the latter class of systems (both in discrete 
and continuous time), if the negative of the derivative of a Lyapunov function 
is a sum of squares, then the Lyapunov function itself is automatically a sum of 
squares. 

This chapter also includes some complexity results. We prove that deciding 
asymptotic stability of homogeneous cubic polynomial vector fields is strongly NP- 
hard. We discuss some byproducts of the reduction that establishes this result, 
including a Lyapunov-inspired technique for proving positivity of forms. 

Chapter 5. In this chapter, we introduce the framework of path-complete graph 
Lyapunov functions for approximation of the joint spectral radius. The approach 
is based on the analysis of the underlying switched system via inequalities imposed 
between multiple Lyapunov functions associated to a labeled directed graph. The 
nodes of this graph represent Lyapunov functions, and its directed edges that 
are labeled with matrices represent Lyapunov inequalities. Inspired by concepts 
in automata theory and symbolic dynamics, we define a class of graphs called 
path-complete graphs, and show that any such graph gives rise to a method for 
proving stability of the switched system. This enables us to derive several asymp- 
totically tight hierarchies of semidefinite programming relaxations that unify and 
generalize many existing techniques such as common quadratic, common sum of 
squares, and maximum/minimum-of-quadratics Lyapunov functions. 

We compare the quality of approximation obtained by certain families of path- 
complete graphs including all path-complete graphs with two nodes on an alpha- 
bet of two matrices. We argue that the De Bruijn graph of order one on m 
symbols, with quadratic Lyapunov functions assigned to its nodes, provides good 
estimates of the JSR of m matrices at a modest computational cost. We prove 
that the bound obtained via this method is invariant under transposition of the 
matrices and always within a multiplicative factor of $1/\sqrt[4]{n}$ of the true JSR
(independent of the number of matrices).

Approximation guarantees for analysis via other families of path-complete 
graphs will also be provided. In particular, we show that the De Bruijn graph of 
order k, with quadratic Lyapunov functions as nodes, can approximate the JSR 
with arbitrary accuracy as k increases. This also proves that common Lyapunov 
functions that are the pointwise maximum (or minimum) of quadratics always 
exist. Moreover, the result gives a bound on the number of quadratic functions 
needed to achieve a desired level of accuracy in approximation of the JSR, and
also demonstrates that these quadratic functions can be found with semidefinite
programming. 

A list of open problems for future research is presented at the end of each chapter.

■ 1.1.1 Related publications

The material presented in this thesis is for the most part based on the following
papers. 

Chapter 2. 

A. A. Ahmadi, A. Olshevsky, P. A. Parrilo, and J. N. Tsitsiklis. NP-hardness 
of deciding convexity of quartic polynomials and related problems. Mathemat- 
ical Programming, 2011. Accepted for publication. Online version available at 
arXiv:1012.1908.

Chapter 3. 

A. A. Ahmadi and P. A. Parrilo. A convex polynomial that is not sos-convex. 
Mathematical Programming, 2011. DOI: 10.1007/s10107-011-0457-z.

A. A. Ahmadi and P. A. Parrilo. A complete characterization of the gap between 
convexity and sos-convexity. In preparation, 2011.

A. A. Ahmadi, G. Blekherman, and P. A. Parrilo. Convex ternary quartics are 
sos-convex. In preparation, 2011. 

Chapter 4. 

A. A. Ahmadi and P. A. Parrilo. Converse results on existence of sum of squares 
Lyapunov functions. In Proceedings of the 50th IEEE Conference on Decision
and Control, 2011. 

A. A. Ahmadi, M. Krstic, and P. A. Parrilo. A globally asymptotically stable 
polynomial vector field with no polynomial Lyapunov function. In Proceedings of 
the 50th IEEE Conference on Decision and Control, 2011.

Chapter 5. 

A. A. Ahmadi, R. Jungers, P. A. Parrilo, and M. Roozbehani. Analysis of the 
joint spectral radius via Lyapunov functions on path-complete graphs. In Hybrid 
Systems: Computation and Control 2011, Lecture Notes in Computer Science. 
Springer, 2011. 
Complexity of Deciding Convexity



In this chapter, we characterize the computational complexity of deciding convex- 
ity and many of its variants in polynomial optimization. The material presented 
in this chapter is based on the work in [5]. 

■ 2.1 Introduction

The role of convexity in modern day mathematical programming has proven to be 
remarkably fundamental, to the point that tractability of an optimization problem 
is nowadays assessed, more often than not, by whether or not the problem benefits 
from some sort of underlying convexity. In the famous words of Rockafellar [143]:

"In fact the great watershed in optimization isn't between linearity and non- 
linearity, but convexity and nonconvexity." 

But how easy is it to distinguish between convexity and nonconvexity? Can we 
decide in an efficient manner if a given optimization problem is convex? 

A class of optimization problems that allow for a rigorous study of this question 
from a computational complexity viewpoint is the class of polynomial optimiza- 
tion problems. These are optimization problems where the objective is given by a 
polynomial function and the feasible set is described by polynomial inequalities. 
Our research in this direction was motivated by a concrete question of N. Z. Shor 
that appeared as one of seven open problems in complexity theory for numerical 
optimization put together by Pardalos and Vavasis in 1992 [117]: 

"Given a degree-4 polynomial in n variables, what is the complexity of de- 
termining whether this polynomial describes a convex function?" 

As we will explain in more detail shortly, the reason why Shor's question is specif- 
ically about degree 4 polynomials is that deciding convexity of odd degree poly- 
nomials is trivial and deciding convexity of degree 2 (quadratic) polynomials can 
be reduced to the simple task of checking whether a constant matrix is positive 
semidefinite. So, the first interesting case really occurs for degree 4 (quartic)
polynomials. Our main contribution in this chapter (Theorem 2.1 in Section 2.2.3) is
to show that deciding convexity of polynomials is strongly NP-hard already for 
polynomials of degree 4. 
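The degree 2 case can be sketched as follows. A quadratic polynomial $p(x) = x^T Q x + b^T x + c$ with Q symmetric has constant Hessian 2Q, so it is convex exactly when Q is positive semidefinite. The routine below is one possible exact test for rational data, based on repeated Schur complements. The function names are our own, and in floating point one would more commonly call a symmetric eigenvalue routine instead.

```python
from fractions import Fraction


def is_psd(matrix):
    """Exact positive semidefiniteness test for a symmetric rational matrix."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n = len(a)
    for k in range(n):
        pivot = a[k][k]
        if pivot < 0:
            return False
        if pivot == 0:
            # A zero diagonal entry of a PSD matrix forces its row to vanish.
            if any(a[k][j] != 0 for j in range(k + 1, n)):
                return False
            continue
        # Replace the trailing block by its Schur complement w.r.t. the pivot.
        for i in range(k + 1, n):
            factor = a[i][k] / pivot
            for j in range(k + 1, n):
                a[i][j] -= factor * a[k][j]
    return True


def quadratic_is_convex(Q):
    """Convexity of x^T Q x + b^T x + c; b and c play no role."""
    return is_psd(Q)


convex_example = quadratic_is_convex([[2, -1], [-1, 2]])     # x1^2 - x1 x2 + x2^2 scaled
nonconvex_example = quadratic_is_convex([[0, 1], [1, 0]])    # 2 x1 x2, a saddle
```

The test uses a cubic number of arithmetic operations in the number of variables, in contrast with the quartic case treated in this chapter.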

The implication of NP-hardness of this problem is that unless P=NP, there 
exists no algorithm that can take as input the (rational) coefficients of a quar- 
tic polynomial, have running time bounded by a polynomial in the number of 
bits needed to represent the coefficients, and output correctly on every instance 
whether or not the polynomial is convex. Furthermore, the fact that our NP- 
hardness result is in the strong sense (as opposed to weakly NP-hard problems 
such as KNAPSACK) implies, roughly speaking, that the problem remains NP-hard
even when the magnitudes of the coefficients of the polynomial are restricted
to be "small." For a strongly NP-hard problem, even a pseudo-polynomial time
algorithm cannot exist unless P=NP. See [61] for precise definitions and more 
details. 

There are many areas of application where one would like to establish con- 
vexity of polynomials. Perhaps the simplest example is in global minimization 
of polynomials, where it could be very useful to decide first whether the poly- 
nomial to be optimized is convex. Once convexity is verified, then every local 
minimum is global and very basic techniques (e.g., gradient descent) can find 
a global minimum — a task that is in general NP-hard in the absence of con- 
vexity [124], [109]. As another example, if we can certify that a homogeneous 
polynomial is convex, then we can define a gauge (or Minkowski) norm based on
its convex sublevel sets, which may be useful in many applications. In several 
other problems of practical relevance, we might be interested not just in checking
whether a given polynomial is convex, but in parameterizing a family of convex
polynomials and perhaps searching or optimizing over them. For example we might
be interested in approximating the convex envelope of a complicated nonconvex 
function with a convex polynomial, or in fitting a convex polynomial to a set of 
data points with minimum error [100]. Not surprisingly, if testing membership in
the set of convex polynomials is hard, searching and optimizing over that set also 
turns out to be a hard problem. 

We also extend our hardness result to some variants of convexity, namely, the 
problems of deciding strict convexity, strong convexity, pseudoconvexity, and qua- 
siconvexity of polynomials. Strict convexity is a property that is often useful to 
check because it guarantees uniqueness of the optimal solution in optimization 
problems. The notion of strong convexity is a common assumption in conver- 
gence analysis of many iterative Newton-type algorithms in optimization theory; 
see, e.g., [38, Chaps. 9-11]. So, in order to ensure the theoretical convergence 
rates promised by many of these algorithms, one needs to first make sure that
the objective function is strongly convex. The problem of checking quasiconvexity
(convexity of sublevel sets) of polynomials also arises frequently in practice.
For instance, if the feasible set of an optimization problem is defined by poly- 
nomial inequalities, by certifying quasiconvexity of the defining polynomials we 
can ensure that the feasible set is convex. In several statistics and clustering 
problems, we are interested in finding minimum volume convex sets that contain 
a set of data points in space. This problem can be tackled by searching over the 
set of quasiconvex polynomials [100]. In economics, quasiconcave functions are 
prevalent as desirable utility functions [92], [18]. In control and systems theory, 
it is useful at times to search for quasiconvex Lyapunov functions whose convex 
sublevel sets contain relevant information about the trajectories of a dynamical 
system [44], [8]. Finally, the notion of pseudoconvexity is a natural generalization 
of convexity that inherits many of the attractive properties of convex functions. 
For example, every stationary point or every local minimum of a pseudoconvex 
function must be a global minimum. Because of these nice features, pseudoconvex 
programs have been studied extensively in nonlinear programming [101], [48]. 

As an outcome of close to a century of research in convex analysis, numerous 
necessary, sufficient, and exact conditions for convexity and all of its variants 
are available; see, e.g., [38, Chap. 3], [104], [60], [49], [92], [102] and references 
therein for a by no means exhaustive list. Our results suggest that none of the 
exact characterizations of these notions can be efficiently checked for polynomials. 
In fact, when turned upside down, many of these equivalent formulations reveal 
new NP-hard problems; see, e.g., Corollary 2.6 and 2.8. 

■ 2.1.1 Related Literature 

There are several results in the literature on the complexity of various special 
cases of polynomial optimization problems. The interested reader can find many 
of these results in the edited volume of Pardalos [116] or in the survey papers of 
de Klerk [54], and Blondel and Tsitsiklis [36]. A very general and fundamental 
concept in certifying feasibility of polynomial equations and inequalities is the 
Tarski-Seidenberg quantifier elimination theory [158], [154], from which it follows 
that all of the problems that we consider in this chapter are algorithmically decid- 
able. This means that there are algorithms that on all instances of our problems 
of interest halt in finite time and always output the correct yes-no answer. Un- 
fortunately, algorithms based on quantifier elimination or similar decision algebra 
techniques have running times that are at least exponential in the number of 
variables [24], and in practice can only solve problems with very few parameters. 

When we turn to the issue of polynomial time solvability, perhaps the most 
relevant result for our purposes is the NP-hardness of deciding nonnegativity of
quartic polynomials and biquadratic forms (see Definition 2.2); the main reduction
that we give in this chapter will in fact be from the latter problem. As we
will see in Section 2.2.3, it turns out that deciding convexity of quartic forms is 
equivalent to checking nonnegativity of a special class of biquadratic forms, which 
are themselves a special class of quartic forms. The NP-hardness of checking non- 
negativity of quartic forms follows, e.g., as a direct consequence of NP-hardness 
of testing matrix copositivity, a result proven by Murty and Kabadi [109]. As 
for the hardness of checking nonnegativity of biquadratic forms, we know of two 
different proofs. The first one is due to Gurvits [70], who proves that the en- 
tanglement problem in quantum mechanics (i.e., the problem of distinguishing 
separable quantum states from entangled ones) is NP-hard. A dual reformulation 
of this result shows directly that checking nonnegativity of biquadratic forms is 
NP-hard; see [59]. The second proof is due to Ling et al. [97], who use a theo- 
rem of Motzkin and Straus to give a very short and elegant reduction from the 
maximum clique problem in graphs. 

The only work in the literature on the hardness of deciding polynomial con- 
vexity that we are aware of is the work of Guo on the complexity of deciding 
convexity of quartic polynomials over simplices [69]. Guo discusses some of the 
difficulties that arise from this problem, but he does not prove that deciding con- 
vexity of polynomials over simplices is NP-hard. Canny shows in [40] that the 
existential theory of the real numbers can be decided in PSPACE. From this, it 
follows that testing several properties of polynomials, including nonnegativity and 
convexity, can be done in polynomial space. In [112], Nie proves that the related 
notion of matrix convexity is NP-hard for polynomial matrices whose entries are 
quadratic forms. 

On the algorithmic side, several techniques have been proposed both for testing 
convexity of sets and convexity of functions. Rademacher and Vempala present 
and analyze randomized algorithms for testing the relaxed notion of approximate 
convexity [135]. In [91], Lasserre proposes a semidefinite programming hierarchy 
for testing convexity of basic closed semialgebraic sets; a problem that we also 
prove to be NP-hard (see Corollary 2.8). As for testing convexity of functions, an 
approach that some convex optimization parsers (e.g., CVX [66]) take is to start 
with some ground set of convex functions and then check whether the desired 
function can be obtained by applying a set of convexity preserving operations 
to the functions in the ground set [50], [38, p. 79]. Techniques of this type that 
are based on the calculus of convex functions are successful for a large range of 
applications. However, when applied to general polynomial functions, they can 
only detect a subclass of convex polynomials. 

Related to convexity of polynomials, a concept that has attracted recent attention
is the algebraic notion of sos-convexity (see Definition 2.4) [75], [89], [90],
[8], [100], [44], [11]. This is a powerful sufficient condition for convexity that relies
on an appropriately defined sum of squares decomposition of the Hessian matrix, 
and can be efficiently checked by solving a single semidefinite program. The study 
of sos-convexity will be the main focus of our next chapter. In particular, we will 
present explicit counterexamples to show that not every convex polynomial is sos- 
convex. The NP-hardness result in this chapter certainly justifies the existence 
of such counterexamples and more generally suggests that any polynomial time 
algorithm attempted for checking polynomial convexity is doomed to fail on some 
hard instances. 

■ 2.1.2 Contributions and organization of this chapter

The main contribution of this chapter is to establish the computational complex- 
ity of deciding convexity, strict convexity, strong convexity, pseudoconvexity, and 
quasiconvexity of polynomials for any given degree. (See Table 2.1 in Section 2.5 
for a quick summary.) The results are mainly divided in three sections, with Sec- 
tion 2.2 covering convexity, Section 2.3 covering strict and strong convexity, and 
Section 2.4 covering quasiconvexity and pseudoconvexity. These three sections 
follow a similar pattern and are each divided into three parts: first, the defini- 
tions and basics, second, the degrees for which the questions can be answered in 
polynomial time, and third, the degrees for which the questions are NP-hard. 

Our main reduction, which establishes NP-hardness of checking convexity of 
quartic forms, is given in Section 2.2.3. This hardness result is extended to strict 
and strong convexity in Section 2.3.3, and to quasiconvexity and pseudoconvexity 
in Section 2.4.3. By contrast, we show in Section 2.4.2 that quasiconvexity and 
pseudoconvexity of odd degree polynomials can be decided in polynomial time. A 
summary of the chapter and some concluding remarks are presented in Section 2.5. 

■ 2.2 Complexity of deciding convexity

■ 2.2.1 Definitions and basics

A (multivariate) polynomial $p(x)$ in variables $x := (x_1, \ldots, x_n)^T$ is a function from
$\mathbb{R}^n$ to $\mathbb{R}$ that is a finite linear combination of monomials:

$$p(x) = \sum_{\alpha} c_\alpha x^\alpha = \sum_{\alpha_1, \ldots, \alpha_n} c_{\alpha_1, \ldots, \alpha_n} x_1^{\alpha_1} \cdots x_n^{\alpha_n}, \tag{2.1}$$

where the sum is over $n$-tuples of nonnegative integers $\alpha_i$. An algorithm for
testing some property of polynomials will have as its input an ordered list of
the coefficients $c_\alpha$. Since our complexity results are based on models of digital
computation, where the input must be represented by a finite number of bits, the
coefficients $c_\alpha$ for us will always be rational numbers, which upon clearing the
denominators can be taken to be integers. So, for the remainder of the chapter,
even when not explicitly stated, we will always have $c_\alpha \in \mathbb{Z}$.

The degree of a monomial $x^\alpha$ is equal to $\alpha_1 + \cdots + \alpha_n$. The degree of a
polynomial $p(x)$ is defined to be the highest degree of its component monomials.
A simple counting argument shows that a polynomial of degree $d$ in $n$ variables
has $\binom{n+d}{d}$ coefficients. A homogeneous polynomial (or a form) is a polynomial
where all the monomials have the same degree. A form $p(x)$ of degree $d$ is a
homogeneous function of degree $d$ (since it satisfies $p(\lambda x) = \lambda^d p(x)$), and has
$\binom{n+d-1}{d}$ coefficients.
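The two coefficient counts above can be checked by brute-force enumeration. The following is a minimal sketch; the function names `monomials_up_to_degree` and `monomials_of_degree` and the test values $n = 3$, $d = 4$ are illustrative choices, not part of the original text.

```python
from itertools import product
from math import comb

def monomials_up_to_degree(n, d):
    """Exponent tuples (a_1, ..., a_n) with a_1 + ... + a_n <= d."""
    return [a for a in product(range(d + 1), repeat=n) if sum(a) <= d]

def monomials_of_degree(n, d):
    """Exponent tuples with a_1 + ... + a_n == d (the monomials of a form)."""
    return [a for a in product(range(d + 1), repeat=n) if sum(a) == d]

# Illustrative sizes (an assumption for this sketch): quartics in 3 variables.
n, d = 3, 4
num_poly_coeffs = len(monomials_up_to_degree(n, d))
num_form_coeffs = len(monomials_of_degree(n, d))

# Compare the enumeration with the closed-form binomial counts.
print(num_poly_coeffs, comb(n + d, d))
print(num_form_coeffs, comb(n + d - 1, d))
```

For fixed degree $d$, both counts grow polynomially in $n$, which is why the coefficient list is a reasonable measure of input size.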

A polynomial $p(x)$ is said to be nonnegative or positive semidefinite (psd) if
$p(x) \geq 0$ for all $x \in \mathbb{R}^n$. Clearly, a necessary condition for a polynomial to be psd
is for its degree to be even. We say that $p(x)$ is a sum of squares (sos), if there exist
polynomials $q_1(x), \ldots, q_m(x)$ such that $p(x) = \sum_{i=1}^m q_i^2(x)$. Every sos polynomial
is obviously psd. A polynomial matrix $P(x)$ is a matrix with polynomial entries.
We say that a polynomial matrix $P(x)$ is PSD (denoted $P(x) \succeq 0$) if it is positive
semidefinite in the matrix sense for every value of the indeterminates $x$. (Note
the upper case convention for matrices.) It is easy to see that $P(x)$ is PSD if and
only if the scalar polynomial $y^T P(x) y$ in variables $(x; y)$ is psd.

We recall that a polynomial p(x) is convex if and only if its Hessian matrix, 
which will be generally denoted by H(x), is PSD. 

■ 2.2.2 Degrees that are easy 

The question of deciding convexity is trivial for odd degree polynomials. Indeed, it 
is easy to check that linear polynomials ($d = 1$) are always convex and that polynomials
of odd degree $d \geq 3$ can never be convex. The case of quadratic polynomials
($d = 2$) is also straightforward. A quadratic polynomial $p(x) = \frac{1}{2} x^T Q x + q^T x + c$
is convex if and only if the constant matrix $Q$ is positive semidefinite. This can
be decided in polynomial time for example by performing Gaussian pivot steps 
along the main diagonal of Q [109] or by computing the characteristic polynomial 
of Q exactly and then checking that the signs of its coefficients alternate [79, p. 
403]. 
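As an illustration of the first approach, here is a minimal sketch of a positive semidefiniteness test based on symmetric Gaussian pivoting in exact rational arithmetic. It is written in the spirit of the pivoting test mentioned above and is not claimed to be the exact procedure of [109]; the function name `is_psd` is an assumption of this sketch.

```python
from fractions import Fraction

def is_psd(Q):
    """Decide whether a symmetric rational matrix Q is positive semidefinite.

    Works down the diagonal taking Schur complements:
      * a negative pivot certifies that Q is not PSD;
      * a zero pivot forces the rest of its row to be zero, else Q is not PSD;
      * a positive pivot is eliminated and the test continues on the
        Schur complement.
    """
    A = [[Fraction(v) for v in row] for row in Q]
    n = len(A)
    for k in range(n):
        pivot = A[k][k]
        if pivot < 0:
            return False
        if pivot == 0:
            # By symmetry it is enough to inspect the row to the right of the pivot.
            if any(A[k][j] != 0 for j in range(k + 1, n)):
                return False
            continue
        for i in range(k + 1, n):
            factor = A[i][k] / pivot
            for j in range(k + 1, n):
                A[i][j] -= factor * A[k][j]
    return True

print(is_psd([[2, -1], [-1, 2]]))  # convex quadratic
print(is_psd([[1, 2], [2, 1]]))    # indefinite matrix
```

Exact fractions are used instead of floating point so that the yes/no answer is not affected by rounding, which matches the bit-model of computation used in this chapter.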

Unfortunately, the results that come next suggest that the case of quadratic 
polynomials is essentially the only nontrivial case where convexity can be efficiently
decided.

■ 2.2.3 Degrees that are hard

The main hardness result of this chapter is the following theorem. 

Theorem 2.1. Deciding convexity of degree four polynomials is strongly NP-hard. 
This is true even when the polynomials are restricted to be homogeneous. 

We will give a reduction from the problem of deciding nonnegativity of bi- 
quadratic forms. We start by recalling some basic facts about biquadratic forms 
and sketching the idea of the proof. 

Definition 2.2. A biquadratic form $b(x; y)$ is a form in the variables
$x = (x_1, \ldots, x_n)^T$ and $y = (y_1, \ldots, y_m)^T$ that can be written as

$$b(x; y) = \sum_{i \leq j,\ k \leq l} \alpha_{ijkl}\, x_i x_j y_k y_l. \tag{2.2}$$

Note that for fixed x, b(x; y) becomes a quadratic form in y, and for fixed y, 
it becomes a quadratic form in x. Every biquadratic form is a quartic form, but 
the converse is of course not true. It follows from a result of Ling et al. [97] that 
deciding nonnegativity of biquadratic forms is strongly NP-hard. This claim is 
not precisely stated in this form in [97]. For the convenience of the reader, let 
us make the connection more explicit before we proceed, as this result underlies 
everything that follows. 

The argument in [97] is based on a reduction from CLIQUE (given a graph
$G(V, E)$ and a positive integer $k \leq |V|$, decide whether $G$ contains a clique of
size $k$ or more) whose (strong) NP-hardness is well-known [61]. For a given graph
$G(V, E)$ on $n$ nodes, if we define the biquadratic form $b_G(x; y)$ in the variables
$x = (x_1, \ldots, x_n)^T$ and $y = (y_1, \ldots, y_n)^T$ by

$$b_G(x; y) = -2 \sum_{(i,j) \in E} x_i x_j y_i y_j,$$

then Ling et al. [97] use a theorem of Motzkin and Straus [108] to show

$$\min_{\|x\| = \|y\| = 1} b_G(x; y) = -1 + \frac{1}{\omega(G)}. \tag{2.3}$$

Here, $\omega(G)$ denotes the clique number of the graph $G$, i.e., the size of a maximum
clique.$^1$ From this, we see that for any value of $k$, $\omega(G) \leq k$ if and only if

$$\min_{\|x\| = \|y\| = 1} b_G(x; y) \geq \frac{1 - k}{k},$$

which by homogenization holds if and only if the biquadratic form

$$\hat{b}_G(x; y) = -2k \sum_{(i,j) \in E} x_i x_j y_i y_j - (1 - k) \Big( \sum_{i=1}^n x_i^2 \Big) \Big( \sum_{i=1}^n y_i^2 \Big)$$

is nonnegative. Hence, by checking nonnegativity of $\hat{b}_G(x; y)$ for all values of
$k \in \{1, \ldots, n - 1\}$, we can find the exact value of $\omega(G)$. It follows that deciding
nonnegativity of biquadratic forms is NP-hard, and in view of the fact that the
coefficients of $\hat{b}_G(x; y)$ are all integers with absolute value at most $2n - 2$, the
NP-hardness claim is in the strong sense. Note also that the result holds even
when $n = m$ in Definition 2.2. In the sequel, we will always have $n = m$.

$^1$ Equation (2.3) above is stated in [97] with the stability number $\alpha(G)$ in place of the clique
number $\omega(G)$. This seems to be a minor typo.
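To make the homogenized form concrete, the sketch below evaluates $\hat{b}_G$ at a point. The function name `b_hat` and the small test graph are assumptions made for illustration. Plugging in $x = y$ equal to the indicator vector of a clique of size $s$ gives the value $-2k\binom{s}{2} - (1-k)s^2 = s(k - s)$, so any clique with more than $k$ nodes is an explicit witness that $\hat{b}_G$ is not nonnegative.

```python
def b_hat(edges, k, x, y):
    """Evaluate -2k * sum_{(i,j) in E} x_i x_j y_i y_j - (1 - k) * (sum x_i^2)(sum y_i^2).

    Each undirected edge must appear exactly once in `edges`.
    """
    edge_sum = sum(x[i] * x[j] * y[i] * y[j] for i, j in edges)
    sq_x = sum(v * v for v in x)
    sq_y = sum(v * v for v in y)
    return -2 * k * edge_sum - (1 - k) * sq_x * sq_y

# Hypothetical test graph: a triangle on nodes {0, 1, 2} plus a pendant node 3,
# so the clique number is 3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
n = 4
triangle = [1, 1, 1, 0]  # indicator vector of the clique {0, 1, 2}

# For a clique of size s = 3 the value is s * (k - s) = 3 * (k - 3).
values = {k: b_hat(edges, k, triangle, triangle) for k in range(1, n)}
print(values)
```

Evaluating at a single point can only certify that the form takes a negative value; certifying nonnegativity for a given $k$ is the hard direction.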

It is not difficult to see that any biquadratic form b(x; y) can be written in the 
form 

$$b(x; y) = y^T A(x) y \tag{2.4}$$

(or of course as x T B(y)x) for some symmetric polynomial matrix A(x) whose 
entries are quadratic forms. Therefore, it is strongly NP-hard to decide whether 
a symmetric polynomial matrix with quadratic form entries is PSD. One might 
hope that this would lead to a quick proof of NP-hardness of testing convexity 
of quartic forms, because the Hessian of a quartic form is exactly a symmetric 
polynomial matrix with quadratic form entries. However, the major problem that 
stands in the way is that not every polynomial matrix is a valid Hessian. Indeed, 
if any of the partial derivatives between the entries of $A(x)$ do not commute (e.g.,
if $\frac{\partial A_{11}(x)}{\partial x_2} \neq \frac{\partial A_{12}(x)}{\partial x_1}$), then $A(x)$ cannot be the matrix of second derivatives of some
polynomial. This is because all mixed third partial derivatives of polynomials
must commute.
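A small worked example, chosen here for illustration and not taken from the original text, shows a polynomial matrix with quadratic form entries that is PSD for every $x$ and yet is not the Hessian of any polynomial:

```latex
A(x) = \begin{bmatrix} x_2^2 & 0 \\ 0 & x_1^2 \end{bmatrix},
\qquad
\frac{\partial A_{11}(x)}{\partial x_2} = 2 x_2
\;\neq\;
0 = \frac{\partial A_{12}(x)}{\partial x_1}.
```

If $A(x)$ were the Hessian of some polynomial $p$, both quantities would equal $\frac{\partial^3 p}{\partial x_1^2 \partial x_2}$ and would therefore have to agree.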

Our task is therefore to prove that even with these additional constraints 
on the entries of A(x), the problem of deciding positive semidefiniteness of such 
matrices remains NP-hard. We will show that any given symmetric $n \times n$ matrix
$A(x)$, whose entries are quadratic forms, can be embedded in a $2n \times 2n$ polynomial
matrix $H(x, y)$, again with quadratic form entries, so that $H(x, y)$ is a valid
Hessian and $A(x)$ is PSD if and only if $H(x, y)$ is. In fact, we will directly construct
the polynomial f(x,y) whose Hessian is the matrix H(x,y). This is done in the 
next theorem, which establishes the correctness of our main reduction. Once this 
theorem is proven, the proof of Theorem 2.1 will become immediate. 

Theorem 2.3. Given a biquadratic form $b(x; y)$, define the $n \times n$ polynomial
matrix $C(x, y)$ by setting

$$[C(x, y)]_{ij} := \frac{\partial^2 b(x; y)}{\partial x_i \partial y_j}, \tag{2.5}$$

and let $\gamma$ be the largest coefficient, in absolute value, of any monomial present in
some entry of the matrix $C(x, y)$. Let $f$ be the form given by

$$f(x, y) := b(x; y) + \frac{n^2 \gamma}{2} \Big( \sum_{i=1}^n x_i^4 + \sum_{i=1}^n y_i^4 + \sum_{\substack{i,j = 1, \ldots, n \\ i < j}} x_i^2 x_j^2 + \sum_{\substack{i,j = 1, \ldots, n \\ i < j}} y_i^2 y_j^2 \Big). \tag{2.6}$$

Then, $b(x; y)$ is psd if and only if $f(x, y)$ is convex.

Proof. Before we prove the claim, let us make a few observations and try to
shed light on the intuition behind this construction. We will use $H(x, y)$ to
denote the Hessian of $f$. This is a $2n \times 2n$ polynomial matrix whose entries are
quadratic forms. The polynomial $f$ is convex if and only if $z^T H(x, y) z$ is psd. For
bookkeeping purposes, let us split the variables $z$ as $z := (z_x, z_y)^T$, where $z_x$ and
$z_y$ each belong to $\mathbb{R}^n$. It will also be helpful to give a name to the second group
of terms in the definition of $f(x, y)$ in (2.6). So, let

$$g(x, y) := \frac{n^2 \gamma}{2} \Big( \sum_{i=1}^n x_i^4 + \sum_{i=1}^n y_i^4 + \sum_{\substack{i,j = 1, \ldots, n \\ i < j}} x_i^2 x_j^2 + \sum_{\substack{i,j = 1, \ldots, n \\ i < j}} y_i^2 y_j^2 \Big). \tag{2.7}$$

We denote the Hessian matrices of $b(x, y)$ and $g(x, y)$ with $H_b(x, y)$ and $H_g(x, y)$
respectively. Thus, $H(x, y) = H_b(x, y) + H_g(x, y)$. Let us first focus on the
structure of $H_b(x, y)$. Observe that if we define

$$[A(x)]_{ij} = \frac{\partial^2 b(x; y)}{\partial y_i \partial y_j},$$

then $A(x)$ depends only on $x$, and

$$\frac{1}{2} y^T A(x) y = b(x; y). \tag{2.8}$$

Similarly, if we let

$$[B(y)]_{ij} = \frac{\partial^2 b(x; y)}{\partial x_i \partial x_j},$$

then $B(y)$ depends only on $y$, and

$$\frac{1}{2} x^T B(y) x = b(x; y). \tag{2.9}$$

From Eq. (2.8), we have that $b(x; y)$ is psd if and only if $A(x)$ is PSD; from Eq.
(2.9), we see that $b(x; y)$ is psd if and only if $B(y)$ is PSD.
Putting the blocks together, we have

$$H_b(x, y) = \begin{bmatrix} B(y) & C(x, y) \\ C^T(x, y) & A(x) \end{bmatrix}. \tag{2.10}$$

The matrix $C(x, y)$ is not in general symmetric. The entries of $C(x, y)$ consist
of square-free monomials that are each a multiple of $x_i y_j$ for some $i, j$, with
$1 \leq i, j \leq n$ (see (2.2) and (2.5)).

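The Hessian $H_g(x, y)$ of the polynomial $g(x, y)$ in (2.7) is given by

$$H_g(x, y) = \frac{n^2 \gamma}{2} \begin{bmatrix} H_g^{11}(x) & 0 \\ 0 & H_g^{22}(y) \end{bmatrix}, \tag{2.11}$$

where

$$H_g^{11}(x) = \begin{bmatrix}
12 x_1^2 + 2 \sum_{\substack{i = 1, \ldots, n \\ i \neq 1}} x_i^2 & 4 x_1 x_2 & \cdots & 4 x_1 x_n \\
4 x_1 x_2 & 12 x_2^2 + 2 \sum_{\substack{i = 1, \ldots, n \\ i \neq 2}} x_i^2 & \cdots & 4 x_2 x_n \\
\vdots & \vdots & \ddots & \vdots \\
4 x_1 x_n & \cdots & 4 x_{n-1} x_n & 12 x_n^2 + 2 \sum_{\substack{i = 1, \ldots, n \\ i \neq n}} x_i^2
\end{bmatrix} \tag{2.12}$$

and

$$H_g^{22}(y) = \begin{bmatrix}
12 y_1^2 + 2 \sum_{\substack{i = 1, \ldots, n \\ i \neq 1}} y_i^2 & 4 y_1 y_2 & \cdots & 4 y_1 y_n \\
4 y_1 y_2 & 12 y_2^2 + 2 \sum_{\substack{i = 1, \ldots, n \\ i \neq 2}} y_i^2 & \cdots & 4 y_2 y_n \\
\vdots & \vdots & \ddots & \vdots \\
4 y_1 y_n & \cdots & 4 y_{n-1} y_n & 12 y_n^2 + 2 \sum_{\substack{i = 1, \ldots, n \\ i \neq n}} y_i^2
\end{bmatrix}. \tag{2.13}$$

Note that all diagonal elements of $H_g^{11}(x)$ and $H_g^{22}(y)$ contain the square of every
variable $x_1, \ldots, x_n$ and $y_1, \ldots, y_n$ respectively.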

We first give an intuitive summary of the rest of the proof. If $b(x; y)$ is not psd,
then $B(y)$ and $A(x)$ are not PSD and hence $H_b(x, y)$ is not PSD. Moreover, adding
$H_g(x, y)$ to $H_b(x, y)$ cannot help make $H(x, y)$ PSD because the dependence of
the diagonal blocks of $H_b(x, y)$ and $H_g(x, y)$ on $x$ and $y$ runs backwards. On
the other hand, if $b(x; y)$ is psd, then $H_b(x, y)$ will have PSD diagonal blocks.
In principle, $H_b(x, y)$ might still not be PSD because of the off-diagonal block
$C(x, y)$. However, the squares in the diagonal elements of $H_g(x, y)$ will be shown
to dominate the monomials of $C(x, y)$ and make $H(x, y)$ PSD.

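Let us now prove the theorem formally. One direction is easy: if $b(x; y)$ is not
psd, then $f(x, y)$ is not convex. Indeed, if there exist $\bar{x}$ and $\bar{y}$ in $\mathbb{R}^n$ such that
$b(\bar{x}; \bar{y}) < 0$, then

$$z^T H(x, y) z \,\Big|_{z_x = 0,\ x = \bar{x},\ y = 0,\ z_y = \bar{y}} = \bar{y}^T A(\bar{x}) \bar{y} = 2 b(\bar{x}; \bar{y}) < 0.$$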



For the converse, suppose that $b(x; y)$ is psd; we will prove that $z^T H(x, y) z$ is
psd and hence $f(x, y)$ is convex. We have

$$z^T H(x, y) z = z_y^T A(x) z_y + z_x^T B(y) z_x + 2 z_x^T C(x, y) z_y + \frac{n^2 \gamma}{2} z_x^T H_g^{11}(x) z_x + \frac{n^2 \gamma}{2} z_y^T H_g^{22}(y) z_y.$$

Because $z_y^T A(x) z_y$ and $z_x^T B(y) z_x$ are psd by assumption (see (2.8) and (2.9)), it
suffices to show that $z^T H(x, y) z - z_y^T A(x) z_y - z_x^T B(y) z_x$ is psd. In fact, we will
show that $z^T H(x, y) z - z_y^T A(x) z_y - z_x^T B(y) z_x$ is a sum of squares.
After some regrouping of terms we can write

$$z^T H(x, y) z - z_y^T A(x) z_y - z_x^T B(y) z_x = p_1(x, y, z) + p_2(x, z_x) + p_3(y, z_y), \tag{2.14}$$

where

$$p_1(x, y, z) = 2 z_x^T C(x, y) z_y + n^2 \gamma \Big( \sum_{i=1}^n z_{x,i}^2 \Big) \Big( \sum_{i=1}^n x_i^2 \Big) + n^2 \gamma \Big( \sum_{i=1}^n z_{y,i}^2 \Big) \Big( \sum_{i=1}^n y_i^2 \Big), \tag{2.15}$$

$$p_2(x, z_x) = n^2 \gamma\, z_x^T \begin{bmatrix}
5 x_1^2 & 2 x_1 x_2 & \cdots & 2 x_1 x_n \\
2 x_1 x_2 & 5 x_2^2 & \cdots & 2 x_2 x_n \\
\vdots & \vdots & \ddots & \vdots \\
2 x_1 x_n & \cdots & 2 x_{n-1} x_n & 5 x_n^2
\end{bmatrix} z_x, \tag{2.16}$$

and

$$p_3(y, z_y) = n^2 \gamma\, z_y^T \begin{bmatrix}
5 y_1^2 & 2 y_1 y_2 & \cdots & 2 y_1 y_n \\
2 y_1 y_2 & 5 y_2^2 & \cdots & 2 y_2 y_n \\
\vdots & \vdots & \ddots & \vdots \\
2 y_1 y_n & \cdots & 2 y_{n-1} y_n & 5 y_n^2
\end{bmatrix} z_y. \tag{2.17}$$

We show that (2.14) is sos by showing that $p_1$, $p_2$, and $p_3$ are each individually
sos. To see that $p_2$ is sos, simply note that we can rewrite it as

$$p_2(x, z_x) = n^2 \gamma \Big[ 3 \sum_{k=1}^n z_{x,k}^2 x_k^2 + 2 \Big( \sum_{k=1}^n z_{x,k} x_k \Big)^2 \Big].$$

The argument for $p_3$ is of course identical. To show that $p_1$ is sos, we argue as
follows. If we multiply out the first term $2 z_x^T C(x, y) z_y$, we obtain a polynomial
with monomials of the form

$$\pm 2 \beta_{i,j,k,l}\, z_{x,k} x_i y_j z_{y,l}, \tag{2.18}$$

where $0 \leq \beta_{i,j,k,l} \leq \gamma$, by the definition of $\gamma$. Since

$$\pm 2 \beta_{i,j,k,l}\, z_{x,k} x_i y_j z_{y,l} + \beta_{i,j,k,l}\, z_{x,k}^2 x_i^2 + \beta_{i,j,k,l}\, y_j^2 z_{y,l}^2 = \beta_{i,j,k,l} (z_{x,k} x_i \pm y_j z_{y,l})^2, \tag{2.19}$$

by pairing up the terms of $2 z_x^T C(x, y) z_y$ with fractions of the squared terms
$z_{x,k}^2 x_i^2$ and $z_{y,l}^2 y_j^2$, we get a sum of squares. Observe that there are more than
enough squares for each monomial of $2 z_x^T C(x, y) z_y$ because each such monomial
$\pm 2 \beta_{i,j,k,l}\, z_{x,k} x_i y_j z_{y,l}$ occurs at most once, so that each of the terms $z_{x,k}^2 x_i^2$ and
$z_{y,l}^2 y_j^2$ will be needed at most $n^2$ times, each time with a coefficient of at most $\gamma$.
Therefore, $p_1$ is sos, and this completes the proof. □
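The smallest instance of the construction, $n = 1$, can be worked out by hand. The example below is an illustration added here under the assumption $b(x; y) = a x^2 y^2$ with a rational constant $a$; it is not taken from the original text.

```latex
b(x; y) = a x^2 y^2, \qquad C(x, y) = 4 a x y, \qquad \gamma = 4|a|, \qquad
f(x, y) = a x^2 y^2 + 2|a| \left( x^4 + y^4 \right),

H(x, y) = \begin{bmatrix}
2 a y^2 + 24 |a| x^2 & 4 a x y \\
4 a x y & 2 a x^2 + 24 |a| y^2
\end{bmatrix}.
```

If $a < 0$, then $b$ is not psd, and at the point $(x, y) = (1, 0)$ the lower right entry equals $2a < 0$, so $f$ is not convex; this is the easy direction with $\bar{x} = \bar{y} = 1$. If $a \geq 0$, both diagonal entries are nonnegative and $\det H(x, y) = a^2 (48 x^4 + 48 y^4 + 564 x^2 y^2) \geq 0$, so $H(x, y)$ is PSD and $f$ is convex, in agreement with Theorem 2.3.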

We can now complete the proof of strong NP-hardness of deciding convexity 
of quartic forms. 

Proof of Theorem 2.1. As we remarked earlier, deciding nonnegativity of biquadratic
forms is known to be strongly NP-hard [97]. Given such a biquadratic form $b(x; y)$,
we can construct the polynomial $f(x, y)$ as in (2.6). Note that $f(x, y)$ has degree
four and is homogeneous. Moreover, the reduction from $b(x; y)$ to $f(x, y)$ runs in
polynomial time as we are only adding to $b(x; y)$ $2n + 2\binom{n}{2}$ new monomials with
coefficient $\frac{n^2 \gamma}{2}$, and the size of $\gamma$ is by definition only polynomially larger than
the size of any coefficient of $b(x; y)$. Since by Theorem 2.3 convexity of $f(x, y)$
is equivalent to nonnegativity of $b(x; y)$, we conclude that deciding convexity of
quartic forms is strongly NP-hard. □
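The reduction itself is simple enough to sketch in code. The representation below (a polynomial in the $2n$ variables $x_1, \ldots, x_n, y_1, \ldots, y_n$ stored as a dictionary from exponent tuples to coefficients) and the names `diff`, `gamma_of`, and `build_f` are assumptions made for this illustration; the sketch only mirrors the construction in (2.5) and (2.6).

```python
from fractions import Fraction

def diff(poly, var):
    """Partial derivative of {exponent tuple: coeff} with respect to variable index var."""
    out = {}
    for exps, c in poly.items():
        if exps[var] > 0:
            new = list(exps)
            new[var] -= 1
            key = tuple(new)
            out[key] = out.get(key, 0) + c * exps[var]
    return {e: c for e, c in out.items() if c != 0}

def gamma_of(b, n):
    """Largest absolute coefficient over all entries d^2 b / (dx_i dy_j) of C(x, y)."""
    best = Fraction(0)
    for i in range(n):
        for j in range(n):
            entry = diff(diff(b, i), n + j)  # y_j has variable index n + j
            for c in entry.values():
                best = max(best, abs(Fraction(c)))
    return best

def build_f(b, n):
    """Return f = b + (n^2 gamma / 2) * (sum x_i^4 + sum y_i^4
    + sum_{i<j} x_i^2 x_j^2 + sum_{i<j} y_i^2 y_j^2), as in (2.6)."""
    scale = Fraction(n * n) * gamma_of(b, n) / 2
    f = {e: Fraction(c) for e, c in b.items()}
    for block in (0, n):  # x-variables first, then y-variables
        for i in range(n):
            e = [0] * (2 * n)
            e[block + i] = 4
            f[tuple(e)] = f.get(tuple(e), 0) + scale
            for j in range(i + 1, n):
                e = [0] * (2 * n)
                e[block + i] = 2
                e[block + j] = 2
                f[tuple(e)] = f.get(tuple(e), 0) + scale
    return f

# Illustrative input: b(x; y) = -2 x_1 x_2 y_1 y_2 with n = 2.
b = {(1, 1, 1, 1): -2}
gamma = gamma_of(b, 2)
f = build_f(b, 2)
print(gamma, len(f))
```

The loops touch each of the $2n + 2\binom{n}{2}$ added monomials once, and computing $\gamma$ requires $n^2$ second derivatives of $b$, which reflects the polynomial running time claimed in the proof.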

An algebraic version of the reduction. Before we proceed further with our results, 
we make a slight detour and present an algebraic analogue of this reduction, 
which relates sum of squares biquadratic forms to sos-convex polynomials. Both 
of these concepts are well-studied in the literature, in particular in regards to 
their connection to semidefinite programming; see, e.g., [97], [11], and references 
therein. 



Definition 2.4. A polynomial $p(x)$, with its Hessian denoted by $H(x)$, is sos-convex
if the polynomial $y^T H(x) y$ is a sum of squares in variables $(x; y)$.$^2$

Theorem 2.5. Given a biquadratic form b(x;y), let f(x,y) be the quartic form 
defined as in (2.6). Then b(x;y) is a sum of squares if and only if f(x,y) is 
sos-convex. 

Proof. The proof is very similar to the proof of Theorem 2.3 and is left to the 
reader. □ 

We will revisit Theorem 2.5 in the next chapter when we study the connection 
between convexity and sos-convexity. 

Some NP-hardness results, obtained as corollaries. NP-hardness of checking convexity 
of quartic forms directly establishes NP-hardness$^3$ of several problems of interest.
Here, we mention a few examples. 

Corollary 2.6. It is NP-hard to decide nonnegativity of a homogeneous polynomial
$q$ of degree four, of the form

$$q(x, y) = \frac{1}{2} p(x) + \frac{1}{2} p(y) - p\Big( \frac{x + y}{2} \Big),$$

for some homogeneous quartic polynomial $p$.

Proof. Since $p$ is continuous, nonnegativity of $q$ (i.e., midpoint convexity of $p$) is
equivalent to convexity of $p$, and the result follows directly from Theorem 2.1. □
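A quick numerical illustration of this equivalence: the sketch below evaluates $q$ exactly for two quartic forms chosen here as examples (they are assumptions of this sketch, not taken from the original text). For the nonconvex form $p(x) = x_1^4 + x_2^4 - 6 x_1^2 x_2^2$, a single point where $q$ is negative certifies nonconvexity.

```python
from fractions import Fraction

def q(p, x, y):
    """Evaluate q(x, y) = p(x)/2 + p(y)/2 - p((x + y)/2) in exact arithmetic."""
    mid = [(Fraction(a) + Fraction(b)) / 2 for a, b in zip(x, y)]
    return Fraction(p(x)) / 2 + Fraction(p(y)) / 2 - p(mid)

def p_nonconvex(v):
    # Example form whose Hessian has the entry 12*x1^2 - 12*x2^2 < 0 at (0, 1).
    x1, x2 = v
    return x1**4 + x2**4 - 6 * x1**2 * x2**2

def p_convex(v):
    x1, x2 = v
    return x1**4 + x2**4

# The midpoint of (1, 1) and (-1, 1) is (0, 1), where p_nonconvex is large.
witness = q(p_nonconvex, (1, 1), (-1, 1))
sanity = q(p_convex, (1, 1), (-1, 1))
print(witness, sanity)
```

As with any point evaluation, a negative value refutes convexity, while a nonnegative value at one point proves nothing; deciding nonnegativity of $q$ everywhere is exactly the hard problem in the corollary.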

Definition 2.7. A set $S \subset \mathbb{R}^n$ is basic closed semialgebraic if it can be written
as

$$S = \{x \in \mathbb{R}^n \mid f_i(x) \geq 0,\ i = 1, \ldots, m\}, \tag{2.20}$$

for some positive integer $m$ and some polynomials $f_i(x)$.

Corollary 2.8. Given a basic closed semialgebraic set S as in (2.20), where at 
least one of the defining polynomials fi(x) has degree four, it is NP-hard to decide 
whether S is a convex set. 

Proof. Given a quartic polynomial $p(x)$, consider the basic closed semialgebraic
set

$$E_p = \{(x, t) \in \mathbb{R}^{n+1} \mid t - p(x) \geq 0\},$$

describing the epigraph of $p(x)$. Since $p(x)$ is convex if and only if its epigraph is
a convex set, the result follows.$^4$ □



2 Three other equivalent definitions of sos-convexity are presented in the next chapter. 

3 All of our NP-hardness results in this chapter are in the strong sense. For the sake of
brevity, from now on we refer to strongly NP-hard problems simply as NP-hard problems.

4 Another proof of this corollary is given by the NP-hardness of checking convexity of sublevel 
sets of quartic polynomials (Theorem 2.24 in Section 2.4.3).

Convexity of polynomials of even degree larger than four. We end this section by
extending our hardness result to polynomials of higher degree. 

Corollary 2.9. It is NP-hard to check convexity of polynomials of any fixed even
degree $d \geq 4$.

Proof. We have already established the result for polynomials of degree four.
Given such a degree four polynomial $p(x) := p(x_1, \ldots, x_n)$ and an even degree
$d \geq 6$, consider the polynomial

$$q(x, x_{n+1}) = p(x) + x_{n+1}^d$$

in $n + 1$ variables. It is clear (e.g., from the block diagonal structure of the Hessian
of $q$) that $p(x)$ is convex if and only if $q(x, x_{n+1})$ is convex. The result follows. □
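For completeness, the block diagonal structure referred to in the proof can be written out explicitly; here $H_p$ and $H_q$ denote the Hessians of $p$ and $q$ (notation introduced for this illustration):

```latex
H_q(x, x_{n+1}) =
\begin{bmatrix}
H_p(x) & 0 \\
0 & d(d-1)\, x_{n+1}^{d-2}
\end{bmatrix}.
```

Since $d$ is even, the scalar block $d(d-1) x_{n+1}^{d-2}$ is nonnegative for every $x_{n+1}$, so $H_q$ is PSD everywhere if and only if $H_p$ is.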

■ 2.3 Complexity of deciding strict convexity and strong convexity

■ 2.3.1 Definitions and basics

Definition 2.10. A function $f : \mathbb{R}^n \to \mathbb{R}$ is strictly convex if for all $x \neq y$ and
all $\lambda \in (0, 1)$, we have

$$f(\lambda x + (1 - \lambda) y) < \lambda f(x) + (1 - \lambda) f(y). \tag{2.21}$$

Definition 2.11. A twice differentiable function $f : \mathbb{R}^n \to \mathbb{R}$ is strongly convex
if its Hessian $H(x)$ satisfies

$$H(x) \succeq mI, \tag{2.22}$$

for a scalar $m > 0$ and for all $x$.

We have the standard implications

$$\text{strong convexity} \implies \text{strict convexity} \implies \text{convexity}, \tag{2.23}$$

but none of the converse implications is true.
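Two standard univariate examples, added here for illustration, show that the converse implications in (2.23) fail:

```latex
f(x) = x^4: \quad \text{strictly convex, but } f''(0) = 0, \text{ so no } m > 0 \text{ satisfies } f''(x) \geq m \text{ for all } x;
\qquad
f(x) = x: \quad \text{convex, but (2.21) holds with equality, so } f \text{ is not strictly convex.}
```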

■ 2.3.2 Degrees that are easy 

From the implications in (2.23) and our previous discussion, it is clear that odd 
degree polynomials can never be strictly convex or strongly convex. We cover the 
case of quadratic polynomials in the following straightforward proposition. 

Proposition 2.12. For a quadratic polynomial $p(x) = \frac{1}{2} x^T Q x + q^T x + c$, the
notions of strict convexity and strong convexity are equivalent, and can be decided
in polynomial time.

Proof. Strong convexity always implies strict convexity. For the reverse direction,
assume that $p(x)$ is not strongly convex. In view of (2.22), this means that the
matrix $Q$ is not positive definite. If $Q$ has a negative eigenvalue, $p(x)$ is not
convex, let alone strictly convex. If $Q$ has a zero eigenvalue, let $\bar{x} \neq 0$ be the
corresponding eigenvector. Then $p(x)$ restricted to the line from the origin to $\bar{x}$
is linear and hence not strictly convex.

To see that these properties can be checked in polynomial time, note that 
p(x) is strongly convex if and only if the symmetric matrix Q is positive definite. 
By Sylvester's criterion, positive definiteness of an n x n symmetric matrix is 
equivalent to positivity of its n leading principal minors, each of which can be 
computed in polynomial time. □ 
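Sylvester's criterion translates directly into a short exact test. The sketch below is one possible implementation under the assumption that $Q$ is given as a list of rows with integer or rational entries; the helper names `det` and `is_positive_definite` are illustrative.

```python
from fractions import Fraction

def det(M):
    """Exact determinant by Gaussian elimination over the rationals."""
    A = [[Fraction(v) for v in row] for row in M]
    n = len(A)
    result = Fraction(1)
    for k in range(n):
        # Find a row at or below k with a nonzero entry in column k.
        pivot_row = next((r for r in range(k, n) if A[r][k] != 0), None)
        if pivot_row is None:
            return Fraction(0)
        if pivot_row != k:
            A[k], A[pivot_row] = A[pivot_row], A[k]
            result = -result  # a row swap flips the sign of the determinant
        result *= A[k][k]
        for i in range(k + 1, n):
            factor = A[i][k] / A[k][k]
            for j in range(k, n):
                A[i][j] -= factor * A[k][j]
    return result

def is_positive_definite(Q):
    """Sylvester's criterion: all n leading principal minors are positive."""
    n = len(Q)
    return all(det([row[:m] for row in Q[:m]]) > 0 for m in range(1, n + 1))

print(is_positive_definite([[2, -1, 0], [-1, 2, -1], [0, -1, 2]]))
print(is_positive_definite([[1, 1], [1, 1]]))
```

Note that the criterion with leading minors characterizes positive definiteness only; it does not extend to positive semidefiniteness by replacing "positive" with "nonnegative".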

■ 2.3.3 Degrees that are hard

With little effort, we can extend our NP-hardness result in the previous section 
to address strict convexity and strong convexity. 

Proposition 2.13. It is NP-hard to decide strong convexity of polynomials of
any fixed even degree $d \geq 4$.

Proof. We give a reduction from the problem of deciding convexity of quartic
forms. Given a homogeneous quartic polynomial $p(x) := p(x_1, \ldots, x_n)$ and an
even degree $d \geq 4$, consider the polynomial

$$q(x, x_{n+1}) := p(x) + x_{n+1}^d + \frac{1}{2} (x_1^2 + \cdots + x_n^2 + x_{n+1}^2) \tag{2.24}$$

in $n + 1$ variables. We claim that $p$ is convex if and only if $q$ is strongly convex.
Indeed, if $p(x)$ is convex, then so is $p(x) + x_{n+1}^d$. Therefore, the Hessian of
$p(x) + x_{n+1}^d$ is PSD. On the other hand, the Hessian of the term $\frac{1}{2} (x_1^2 + \cdots + x_n^2 + x_{n+1}^2)$
is the identity matrix. So, the minimum eigenvalue of the Hessian of $q(x, x_{n+1})$ is
positive and bounded below by one. Hence, $q$ is strongly convex.

Now suppose that $p(x)$ is not convex. Let us denote the Hessians of $p$ and $q$
respectively by $H_p$ and $H_q$. If $p$ is not convex, then there exists a point $\bar{x} \in \mathbb{R}^n$
such that

$$\lambda_{\min}(H_p(\bar{x})) < 0,$$

where $\lambda_{\min}$ here denotes the minimum eigenvalue. Because $p(x)$ is homogeneous
of degree four, we have

$$\lambda_{\min}(H_p(c \bar{x})) = c^2 \lambda_{\min}(H_p(\bar{x})),$$

for any scalar $c \in \mathbb{R}$. Pick $c$ large enough such that $\lambda_{\min}(H_p(c \bar{x})) < -1$. Then it
is easy to see that $H_q(c \bar{x}, 0)$ has a negative eigenvalue and hence $q$ is not convex,
let alone strongly convex. □

Remark 2.3.1. It is worth noting that homogeneous polynomials of degree $d > 2$
can never be strongly convex (because their Hessians vanish at the origin). Not
surprisingly, the polynomial $q$ in the proof of Proposition 2.13 is not homogeneous.
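The eigenvalue argument in the proof of Proposition 2.13 can be made explicit by writing out the Hessian of $q$ from (2.24); the symbol $\varepsilon$ below is introduced only for this illustration:

```latex
H_q(x, x_{n+1}) =
\begin{bmatrix}
H_p(x) + I_n & 0 \\
0 & d(d-1)\, x_{n+1}^{d-2} + 1
\end{bmatrix},
\qquad
\lambda_{\min}\big(H_p(\bar{x})\big) = -\varepsilon < 0
\;\Longrightarrow\;
\lambda_{\min}\big(H_p(c \bar{x})\big) = -c^2 \varepsilon < -1
\ \text{ whenever } c > \frac{1}{\sqrt{\varepsilon}}.
```

For such a $c$, the upper left block of $H_q(c \bar{x}, 0)$ has minimum eigenvalue $1 - c^2 \varepsilon < 0$, which is the negative eigenvalue used in the proof.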

Proposition 2.14. It is NP-hard to decide strict convexity of polynomials of any
fixed even degree $d \geq 4$.

Proof. The proof is almost identical to the proof of Proposition 2.13. Let q be 
defined as in (2.24). If p is convex, then we established that q is strongly convex 
and hence also strictly convex. If p is not convex, we showed that q is not convex 
and hence also not strictly convex. □ 

■ 2.4 Complexity of deciding quasiconvexity and pseudoconvexity 

■ 2.4.1 Definitions and basics 

Definition 2.15. A function $f : \mathbb{R}^n \to \mathbb{R}$ is quasiconvex if its sublevel sets 

$S(\alpha) := \{x \in \mathbb{R}^n \mid f(x) \leq \alpha\}$, (2.25) 

for all $\alpha \in \mathbb{R}$, are convex. 

Definition 2.16. A differentiable function $f : \mathbb{R}^n \to \mathbb{R}$ is pseudoconvex if the 
implication 

$\nabla f(x)^T (y - x) \geq 0 \implies f(y) \geq f(x)$ (2.26) 

holds for all x and y in $\mathbb{R}^n$. 

The following implications are well-known (see e.g. [25, p. 143]): 

$\text{convexity} \implies \text{pseudoconvexity} \implies \text{quasiconvexity}$, (2.27) 

but the converse of neither implication is true in general. 

■ 2.4.2 Degrees that are easy 

As we remarked earlier, linear polynomials are always convex and hence also 
pseudoconvex and quasiconvex. Unlike convexity, however, it is possible for 
polynomials of odd degree $d \geq 3$ to be pseudoconvex or quasiconvex. We will show in 
this section that somewhat surprisingly, quasiconvexity and pseudoconvexity of 
polynomials of any fixed odd degree can be decided in polynomial time. Before 
we present these results, we will cover the easy case of quadratic polynomials. 

Proposition 2.17. For a quadratic polynomial $p(x) = \frac{1}{2} x^T Q x + q^T x + c$, the 
notions of convexity, pseudoconvexity, and quasiconvexity are equivalent, and can 
be decided in polynomial time. 

Proof. We argue that the quadratic polynomial p(x) is convex if and only if it 
is quasiconvex. Indeed, if p(x) is not convex, then Q has a negative eigenvalue; 
letting x be a corresponding eigenvector, we have that p(tx) is a quadratic poly- 
nomial in t, with negative leading coefficient, so p(tx) is not quasiconvex, as a 
function of t. This, however, implies that p(x) is not quasiconvex. 

We have already argued in Section 2.2.2 that convexity of quadratic polyno- 
mials can be decided in polynomial time. □ 
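
By Proposition 2.17, all three properties of a quadratic reduce to positive semidefiniteness of Q. The sketch below shows one way such a test could be carried out in exact rational arithmetic by symmetric Gaussian elimination. The function name `is_psd` and the list-of-lists matrix format are assumptions made for illustration, and Q is assumed to be symmetric.

```python
from fractions import Fraction


def is_psd(Q):
    """Exact test of positive semidefiniteness for a symmetric matrix Q.

    Eliminates one diagonal pivot at a time; the trailing block that remains
    is the Schur complement, which is PSD exactly when the current block is.
    """
    A = [[Fraction(v) for v in row] for row in Q]
    n = len(A)
    for k in range(n):
        pivot = A[k][k]
        if pivot < 0:
            return False
        if pivot == 0:
            # A PSD matrix with a zero diagonal entry has a zero row there.
            if any(A[k][j] != 0 for j in range(k + 1, n)):
                return False
            continue
        for i in range(k + 1, n):
            factor = A[i][k] / pivot
            for j in range(k + 1, n):
                A[i][j] -= factor * A[k][j]
    return True


# p(x) = (1/2) x^T Q x + q^T x + c is convex (equivalently quasiconvex)
# exactly when is_psd(Q) holds; q and c play no role.
convex_example = is_psd([[2, 1], [1, 2]])
nonconvex_example = is_psd([[1, 2], [2, 1]])
```

The linear and constant terms never enter the test, which mirrors the proof: only an eigenvector of Q with a negative eigenvalue can break quasiconvexity.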

Quasiconvexity of polynomials of odd degree 

In this subsection, we provide a polynomial time algorithm for checking whether 
an odd-degree polynomial is quasiconvex. Towards this goal, we will first show 
that quasiconvex polynomials of odd degree have a very particular structure 
(Proposition 2.20). 

Our first lemma concerns quasiconvex polynomials of odd degree in one vari- 
able. The proof is easy and left to the reader. A version of this lemma is provided 
in [38, p. 99], though there also without proof. 

Lemma 2.18. Suppose that p(t) is a quasiconvex univariate polynomial of odd 
degree. Then, p(t) is monotonic. 

Next, we use the preceding lemma to characterize the complements of sublevel 
sets of quasiconvex polynomials of odd degree. 

Lemma 2.19. Suppose that p(x) is a quasiconvex polynomial of odd degree d. 
Then the set $\{x \mid p(x) \geq \alpha\}$ is convex. 

Proof. Suppose not. In that case, there exist x, y, z such that z is on the line 
segment connecting x and y, and such that $p(x), p(y) \geq \alpha$ but $p(z) < \alpha$. Consider 
the polynomial 

$q(t) = p(x + t(y - x))$. 

This is, of course, a quasiconvex polynomial with $q(0) = p(x)$, $q(1) = p(y)$, and 
$q(t') = p(z)$, for some $t' \in (0, 1)$. If q(t) has degree d, then, by Lemma 2.18, it 
must be monotonic, which immediately provides a contradiction. 

Suppose now that q(t) has degree less than d. Let us attempt to perturb x to 
x + x', and y to y + y', so that the new polynomial 

$\hat{q}(t) = p(x + x' + t(y + y' - x - x'))$ 

has the following two properties: (i) $\hat{q}(t)$ is a polynomial of degree d, and (ii) 
$\hat{q}(0) > \hat{q}(t')$, $\hat{q}(1) > \hat{q}(t')$. If such perturbation vectors x', y' can be found, then 
we obtain a contradiction as in the previous paragraph. 

To satisfy condition (ii), it suffices (by continuity) to take x', y' with $\|x'\|, \|y'\|$ 
small enough. Thus, we only need to argue that we can find arbitrarily small x', y' 
that satisfy condition (i). Observe that the coefficient of $t^d$ in the polynomial 
$\hat{q}(t)$ is a nonzero polynomial in x + x', y + y'; let us denote that coefficient as 
$r(x + x', y + y')$. Since r is a nonzero polynomial, it cannot vanish at all points 
of any given ball. Therefore, even when considering a small ball around (x, y) (to 
satisfy condition (ii)), we can find (x + x', y + y') in that ball, with 
$r(x + x', y + y') \neq 0$, thus establishing that the degree of $\hat{q}$ is indeed d. This 
completes the proof. □ 

We now proceed to a characterization of quasiconvex polynomials of odd de- 
gree. 

Proposition 2.20. Let p(x) be a polynomial of odd degree d. Then, p(x) is 
quasiconvex if and only if it can be written as 

$p(x) = h(\xi^T x)$, (2.28) 

for some nonzero $\xi \in \mathbb{R}^n$, and for some monotonic univariate polynomial h(t) of 
degree d. If, in addition, we require the nonzero component of $\xi$ with the smallest 
index to be equal to unity, then $\xi$ and h(t) are uniquely determined by p(x). 

Proof. It is easy to see that any polynomial that can be written in the above 
form is quasiconvex. In order to prove the converse, let us assume that p(x) 
is quasiconvex. By the definition of quasiconvexity, the closed set 
$S(\alpha) = \{x \mid p(x) \leq \alpha\}$ is convex. On the other hand, Lemma 2.19 states that the 
closure of the complement of $S(\alpha)$ is also convex. It is not hard to verify that, as a 
consequence of these two properties, the set $S(\alpha)$ must be a halfspace. Thus, for 
any given $\alpha$, the sublevel set $S(\alpha)$ can be written as $\{x \mid \xi(\alpha)^T x \leq c(\alpha)\}$ for some 
$\xi(\alpha) \in \mathbb{R}^n$ and $c(\alpha) \in \mathbb{R}$. This of course implies that the level sets 
$\{x \mid p(x) = \alpha\}$ are hyperplanes of the form $\{x \mid \xi(\alpha)^T x = c(\alpha)\}$. 

We note that the sublevel sets are necessarily nested: if $\alpha < \beta$, then 
$S(\alpha) \subseteq S(\beta)$. An elementary consequence of this property is that the hyperplanes must 
be collinear, i.e., that the vectors $\xi(\alpha)$ must be positive multiples of each other. 
Thus, by suitably scaling the coefficients $c(\alpha)$, we can assume, without loss of 
generality, that $\xi(\alpha) = \xi$, for some $\xi \in \mathbb{R}^n$, and for all $\alpha$. We then have that 
$\{x \mid p(x) = \alpha\} = \{x \mid \xi^T x = c(\alpha)\}$. Clearly, there is a one-to-one correspondence 
between $\alpha$ and $c(\alpha)$, and therefore the value of p(x) is completely determined by 
$\xi^T x$. In particular, there exists a function h(t) such that $p(x) = h(\xi^T x)$. Since 
p(x) is a polynomial of degree d, it follows that h(t) is a univariate polynomial 
of degree d. Finally, we observe that if h(t) is not monotonic, then p(x) is not 
quasiconvex. This proves that a representation of the desired form exists. Note 
that by suitably scaling $\xi$, we can also impose the condition that the nonzero 
component of $\xi$ with the smallest index is equal to one. 

Suppose now that p(x) can also be represented in the form $p(x) = \bar{h}(\bar{\xi}^T x)$ 
for some other polynomial $\bar{h}(t)$ and vector $\bar{\xi}$. Then, the gradient vector of p(x) 
must be proportional to both $\xi$ and $\bar{\xi}$. The vectors $\xi$ and $\bar{\xi}$ are therefore collinear. 
Once we impose the requirement that the nonzero component of $\xi$ with the 
smallest index is equal to one, we obtain that $\xi = \bar{\xi}$ and, consequently, $h = \bar{h}$. This 
establishes the claimed uniqueness of the representation. □ 

Remark. It is not hard to see that if p(x) is homogeneous and quasiconvex, then 
one can additionally conclude that h(t) can be taken to be $h(t) = t^d$, where d is 
the degree of p(x). 

Theorem 2.21. For any fixed odd degree d, the quasiconvexity of polynomials of 
degree d can be checked in polynomial time. 

Proof. The algorithm consists of attempting to build a representation of p(x) of 
the form given in Proposition 2.20. The polynomial p(x) is quasiconvex if and 
only if the attempt is successful. 

Let us proceed under the assumption that p(x) is quasiconvex. We differentiate 
p(x) symbolically to obtain its gradient vector. Since a representation of the form 
given in Proposition 2.20 exists, the gradient is of the form $\nabla p(x) = \xi h'(\xi^T x)$, 
where h'(t) is the derivative of h(t). In particular, the different components of 
the gradient are polynomials that are proportional to each other. (If they are not 
proportional, we conclude that p(x) is not quasiconvex, and the algorithm 
terminates.) By considering the ratios between different components, we can identify 
the vector $\xi$, up to a scaling factor. By imposing the additional requirement 
that the nonzero component of $\xi$ with the smallest index is equal to one, we can 
identify $\xi$ uniquely. 
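
The gradient step above can be sketched as follows. This is a minimal illustration, not a definitive implementation: a polynomial is assumed to be stored as a dictionary mapping exponent tuples to nonzero coefficients, and the names `partial` and `recover_direction` are hypothetical.

```python
from fractions import Fraction


def partial(p, i):
    """Partial derivative of p with respect to variable i.

    p maps exponent tuples to nonzero coefficients.
    """
    out = {}
    for exps, c in p.items():
        if exps[i] > 0:
            e = list(exps)
            e[i] -= 1
            out[tuple(e)] = c * exps[i]
    return out


def recover_direction(p, n):
    """Return the normalized direction vector if all gradient components are
    constant multiples of one polynomial, and None otherwise."""
    grad = [partial(p, i) for i in range(n)]
    j = next((i for i in range(n) if grad[i]), None)
    if j is None:
        return None  # p is constant
    mono, base = next(iter(grad[j].items()))
    xi = []
    for g in grad:
        # Candidate ratio read off from a single monomial of grad[j] ...
        ratio = Fraction(g.get(mono, 0)) / base
        # ... and then checked against every monomial.
        expected = {m: ratio * c for m, c in grad[j].items()} if ratio != 0 else {}
        if g != expected:
            return None
        xi.append(ratio)
    return xi


# p(x1, x2) = (x1 + 2*x2)^3, expanded by hand.
cube = {(3, 0): 1, (2, 1): 6, (1, 2): 12, (0, 3): 8}
direction = recover_direction(cube, 2)
```

Dividing by the first nonzero gradient component makes the first nonzero entry of the returned vector equal to one, which is the normalization used in Proposition 2.20. A polynomial such as $x_1^3 + x_2^3$ is rejected at this stage because its two partial derivatives are not proportional.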

We now proceed to identify the polynomial h(t). For k = 1, . . . , d + 1, we 
evaluate $p(k\xi)$, which must be equal to $h(\xi^T \xi k)$. We thus obtain the values of 
h(t) at d + 1 distinct points, from which h(t) is completely determined. We then 
verify that $h(\xi^T x)$ is indeed equal to p(x). This is easily done, in polynomial 
time, by writing out the $O(n^d)$ coefficients of these two polynomials in x and 
verifying that they are equal. (If they are not all equal, we conclude that p(x) is 
not quasiconvex, and the algorithm terminates.) 

Finally, we test whether the above constructed univariate polynomial h is 
monotonic, i.e., whether its derivative h'(t) is either nonnegative or nonpositive. 
This can be accomplished, e.g., by quantifier elimination or by other well-known 
algebraic techniques for counting the number and the multiplicity of real roots of 
univariate polynomials; see [24]. Note that this requires only a constant number 
of arithmetic operations since the degree d is fixed. If h fails this test, then p(x) 
is not quasiconvex. Otherwise, our attempt has been successful and we decide 
that p(x) is indeed quasiconvex. □ 
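
The final monotonicity test can be illustrated with Sturm chains, which is one possible choice among the root-counting techniques mentioned above and not necessarily the one intended in [24]. In this sketch a univariate polynomial is assumed to be a list of coefficients with the lowest degree first, and all function names are hypothetical. The idea is that h' changes sign exactly when it has a real root of odd multiplicity.

```python
from fractions import Fraction


def trim(p):
    """Drop trailing zero coefficients (lowest degree is stored first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def derivative(p):
    return trim([k * c for k, c in enumerate(p)][1:])


def remainder(a, b):
    """Remainder of the division of a by the nonzero polynomial b."""
    a = trim(a)
    while len(a) >= len(b):
        factor, shift = a[-1] / b[-1], len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] -= factor * c
        a = trim(a[:-1])  # the leading coefficient is now exactly zero
    return a


def distinct_real_roots(f):
    """Number of distinct real roots of f, and gcd(f, f') up to scaling."""
    chain = [f, derivative(f)]
    while chain[-1]:
        chain.append([-c for c in remainder(chain[-2], chain[-1])])
    chain.pop()  # discard the terminating zero polynomial

    def variations(at_minus_infinity):
        signs = []
        for s in chain:
            sign = 1 if s[-1] > 0 else -1
            if at_minus_infinity and len(s) % 2 == 0:  # odd degree flips sign
                sign = -sign
            signs.append(sign)
        return sum(a != b for a, b in zip(signs, signs[1:]))

    return variations(True) - variations(False), chain[-1]


def is_monotonic(h):
    """True if h' has no real root of odd multiplicity."""
    f = derivative([Fraction(c) for c in h])
    if not f:
        return True  # h is constant
    counts = []  # counts[k]: real roots of h' with multiplicity greater than k
    while len(f) > 1:
        r, f = distinct_real_roots(f)
        counts.append(r)
    counts.append(0)
    return all(counts[m - 1] == counts[m] for m in range(1, len(counts), 2))
```

For example, `is_monotonic([0, 0, 0, 1])` accepts $t^3$, whose derivative has a double root at zero, while `is_monotonic([0, -3, 0, 1])` rejects $t^3 - 3t$. Because the degree d is fixed, the number of arithmetic operations is bounded by a constant.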

Pseudoconvexity of polynomials of odd degree 

In analogy to Proposition 2.20, we present next a characterization of odd degree 
pseudoconvex polynomials, which gives rise to a polynomial time algorithm for 
checking this property. 

Corollary 2.22. Let p(x) be a polynomial of odd degree d. Then, p(x) is 
pseudoconvex if and only if p(x) can be written in the form 

$p(x) = h(\xi^T x)$, (2.29) 

for some $\xi \in \mathbb{R}^n$ and some univariate polynomial h of degree d such that its 
derivative h'(t) has no real roots. 

Remark. Observe that polynomials h with h' having no real roots comprise a 
subset of the set of monotonic polynomials. 

Proof. Suppose that p(x) is pseudoconvex. Since a pseudoconvex polynomial is 
quasiconvex, it admits a representation $h(\xi^T x)$ where h is monotonic. If $h'(t) = 0$ 
for some t, then picking $a = t \cdot \xi / \|\xi\|_2^2$, we have that $\nabla p(a) = 0$, so that by 
pseudoconvexity, p(x) is minimized at a. This, however, is impossible since an 
odd degree polynomial is never bounded below. Conversely, suppose p(x) can be 
represented as in Eq. (2.29). Fix some x, y, and define the polynomial 
$u(t) = p(x + t(y - x))$. Since $u(t) = h(\xi^T x + t \xi^T (y - x))$, we have that either (i) u(t) is 
constant, or (ii) u'(t) has no real roots. Now if $\nabla p(x)^T (y - x) \geq 0$, then $u'(0) \geq 0$. 
Regardless of whether (i) or (ii) holds, this implies that $u'(t) \geq 0$ everywhere, so 
that $u(1) \geq u(0)$, that is, $p(y) \geq p(x)$. □ 

Corollary 2.23. For any fixed odd degree d, the pseudoconvexity of polynomials 
of degree d can be checked in polynomial time. 

Proof. This is a simple modification of our algorithm for testing quasiconvexity 
(Theorem 2.21). The first step of the algorithm is in fact identical: once we 
impose the additional requirement that the nonzero component of $\xi$ with the 
smallest index should be equal to one, we can uniquely determine the vector $\xi$ 
and the coefficients of the univariate polynomial h(t) that satisfy Eq. (2.29). (If 
we fail, p(x) is not quasiconvex and hence also not pseudoconvex.) Once we have 
h(t), we can check whether h'(t) has no real roots e.g. by computing the signature 
of the Hermite form of h'(t); see [24]. □ 

Remark 2.4.1. Homogeneous polynomials of odd degree $d \geq 3$ are never 
pseudoconvex. The reason is that the gradient of these polynomials vanishes at the 
origin, but yet the origin is not a global minimum since odd degree polynomials 
are unbounded below. 
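
A small worked pair of examples may help separate the two odd-degree characterizations. The polynomials below are chosen here purely for illustration; both are univariate cubics, so the direction vector is the scalar 1.

```latex
h_1(t) = t^3, \qquad h_1'(t) = 3t^2 \ \ (\text{real root at } t = 0), \\
h_2(t) = t^3 + t, \qquad h_2'(t) = 3t^2 + 1 > 0 \ \ (\text{no real roots}).
```

Both are monotonic, so both $x^3$ and $x^3 + x$ are quasiconvex. Only the second has a derivative without real roots, so only $x^3 + x$ is pseudoconvex. For $p(x) = x^3$ the implication (2.26) fails at x = 0, y = -1: the gradient term is zero, yet p(-1) = -1 < 0 = p(0).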

■ 2.4.3 Degrees that are hard 

The main result of this section is the following theorem. 

Theorem 2.24. It is NP-hard to check quasiconvexity/pseudoconvexity of degree 
four polynomials. This is true even when the polynomials are restricted to be 
homogeneous. 

In view of Theorem 2.1, which established NP-hardness of deciding convexity 
of homogeneous quartic polynomials, Theorem 2.24 follows immediately from the 
following result. 

Theorem 2.25. For a homogeneous polynomial p(x) of even degree d, the notions 
of convexity, pseudoconvexity, and quasiconvexity are all equivalent.⁵ 

We start the proof of this theorem by first proving an easy lemma. 

Lemma 2.26. Let p(x) be a quasiconvex homogeneous polynomial of even degree 
$d \geq 2$. Then p(x) is nonnegative. 

Proof. Suppose, to derive a contradiction, that there exist some $\epsilon > 0$ and 
$x \in \mathbb{R}^n$ such that $p(x) = -\epsilon$. Then by homogeneity of even degree we must have 
$p(-x) = p(x) = -\epsilon$. On the other hand, homogeneity of p implies that p(0) = 0. Since the 
origin is on the line between x and -x, this shows that the sublevel set $S(-\epsilon)$ is 
not convex, contradicting the quasiconvexity of p. □ 

⁵ The result is more generally true for differentiable functions that are homogeneous of even 
degree. Also, the requirements of homogeneity and having an even degree both need to be 
present. Indeed, $x^3$ and $x^4 - 8x^3 + 18x^2$ are both quasiconvex but not convex, the first being 
homogeneous of odd degree and the second being nonhomogeneous of even degree. 

Proof of Theorem 2.25. We show that a quasiconvex homogeneous polynomial of 
even degree is convex. In view of implication (2.27), this proves the theorem. 

Suppose that p(x) is a quasiconvex polynomial. Define 
$S = \{x \in \mathbb{R}^n \mid p(x) \leq 1\}$. By homogeneity, for any $a \in \mathbb{R}^n$ with p(a) > 0, we have that 

$\frac{a}{p(a)^{1/d}} \in S$. 

By quasiconvexity, this implies that for any a, b with p(a), p(b) > 0, any point on 
the line connecting $a/p(a)^{1/d}$ and $b/p(b)^{1/d}$ is in S. In particular, consider 

$c = \frac{a + b}{p(a)^{1/d} + p(b)^{1/d}}$. 

Because c can be written as 

$c = \left(\frac{p(a)^{1/d}}{p(a)^{1/d} + p(b)^{1/d}}\right) \left(\frac{a}{p(a)^{1/d}}\right) + \left(\frac{p(b)^{1/d}}{p(a)^{1/d} + p(b)^{1/d}}\right) \left(\frac{b}{p(b)^{1/d}}\right)$, 

we have that $c \in S$, i.e., $p(c) \leq 1$. By homogeneity, this inequality can be restated 
as 

$p(a + b) \leq (p(a)^{1/d} + p(b)^{1/d})^d$, 

and therefore 

$p\left(\frac{a + b}{2}\right) \leq \left(\frac{p(a)^{1/d} + p(b)^{1/d}}{2}\right)^d \leq \frac{p(a) + p(b)}{2}$, (2.30) 

where the last inequality is due to the convexity of $x^d$. 

Finally, note that for any nonzero polynomial p, the set $\{x \mid p(x) \neq 0\}$ is dense in 
$\mathbb{R}^n$ (here we again appeal to the fact that the only polynomial that is zero on 
a ball of positive radius is the zero polynomial); and since p is nonnegative due 
to Lemma 2.26, the set $\{x \mid p(x) > 0\}$ is dense in $\mathbb{R}^n$. Using the continuity 
of p, it follows that Eq. (2.30) holds not only when a, b satisfy p(a), p(b) > 0, 
but for all a, b. Appealing to the continuity of p again, we see that for all a, b, 
$p(\lambda a + (1 - \lambda) b) \leq \lambda p(a) + (1 - \lambda) p(b)$, for all $\lambda \in [0, 1]$. This establishes that p 
is convex. □ 
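
The nonhomogeneous example $x^4 - 8x^3 + 18x^2$ mentioned in the footnote to Theorem 2.25 can be checked by hand. The factorizations below are a worked verification added for illustration; the name f is introduced here.

```latex
f(x) = x^4 - 8x^3 + 18x^2, \\
f'(x) = 4x^3 - 24x^2 + 36x = 4x\,(x - 3)^2, \\
f''(x) = 12x^2 - 48x + 36 = 12\,(x - 1)(x - 3).
```

The first derivative is nonpositive for x <= 0 and nonnegative for x >= 0, so f decreases and then never decreases again, which makes every sublevel set an interval and f quasiconvex. The second derivative is negative on the interval (1, 3), so f is not convex. This shows that homogeneity cannot be dropped from Theorem 2.25.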

Quasiconvexity/pseudoconvexity of polynomials of even degree larger than four. 

Corollary 2.27. It is NP-hard to decide quasiconvexity of polynomials of any 
fixed even degree $d \geq 4$. 

Proof. We have already proved the result for d = 4. To establish the result 
for even degree $d \geq 6$, recall that we have established NP-hardness of deciding 
convexity of homogeneous quartic polynomials. Given such a quartic form 
$p(x) := p(x_1, \ldots, x_n)$, consider the polynomial 

$q(x_1, \ldots, x_{n+1}) = p(x_1, \ldots, x_n) + x_{n+1}^d$. (2.31) 

We claim that q is quasiconvex if and only if p is convex. Indeed, if p is convex, 
then obviously so is q, and therefore q is quasiconvex. Conversely, if p is not 
convex, then by Theorem 2.25, it is not quasiconvex. So, there exist points 
$a, b, c \in \mathbb{R}^n$, with c on the line connecting a and b, such that $p(a) \leq 1$, $p(b) \leq 1$, but 
p(c) > 1. Considering points (a, 0), (b, 0), (c, 0), we see that q is not quasiconvex. 
It follows that it is NP-hard to decide quasiconvexity of polynomials of even degree 
four or larger. □ 

Corollary 2.28. It is NP-hard to decide pseudoconvexity of polynomials of any 
fixed even degree $d \geq 4$. 

Proof. The proof is almost identical to the proof of Corollary 2.27. Let q be 
defined as in (2.31). If p is convex, then q is convex and hence also 
pseudoconvex. If p is not convex, we showed that q is not quasiconvex and hence also not 
pseudoconvex. □ 

■ 2.5 Summary and conclusions 

In this chapter, we studied the computational complexity of testing convexity and 
some of its variants, for polynomial functions. The notions that we considered 
and the implications among them are summarized below: 

strong convexity => strict convexity => convexity => pseudoconvexity => quasiconvexity. 

Our complexity results as a function of the degree of the polynomial are listed 
in Table 2.1. We gave polynomial time algorithms for checking pseudoconvexity 
and quasiconvexity of odd degree polynomials that can be useful in many 
applications. Our negative results, on the other hand, imply (under P ≠ NP) the 
impossibility of a polynomial time (or even pseudo-polynomial time) algorithm 
for testing any of the properties listed in Table 2.1 for polynomials of even degree 
four or larger. Although the implications of convexity are very significant in 
optimization theory, our results suggest that unless additional structure is present, 
ensuring the mere presence of convexity is likely an intractable task. It is therefore 
natural to wonder whether there are other properties of optimization problems 
that share some of the attractive consequences of convexity, but are easier to 
check. 

property vs. degree    1      2     odd ≥ 3    even ≥ 4 
strong convexity       no     P     no         strongly NP-hard 
strict convexity       no     P     no         strongly NP-hard 
convexity              yes    P     no         strongly NP-hard 
pseudoconvexity        yes    P     P          strongly NP-hard 
quasiconvexity         yes    P     P          strongly NP-hard 

Table 2.1. Summary of our complexity results. A yes (no) entry means that the question is 
trivial for that particular entry because the answer is always yes (no) independent of the input. 
By P, we mean that the problem can be solved in polynomial time. 

The hardness results of this chapter also lay emphasis on the need for finding 
good approximation algorithms for recognizing convexity that can deal with a 
large number of instances. This is our motivation for the next chapter as we 
turn our attention to the study of algebraic counterparts of convexity that can be 
efficiently checked with semidefinite programming. 



Chapter 3 



Convexity and SOS-Convexity 



The overall contribution of this chapter is a complete characterization of the 
containment of the sets of convex and sos-convex polynomials in every degree and 
dimension. The content of this chapter is mostly based on the work in [9], but 
also includes parts of [11] and [2]. 

■ 3.1 Introduction 

■ 3.1.1 Nonnegativity and sum of squares 

One of the cornerstones of real algebraic geometry is Hilbert's seminal paper in 
1888 [77], where he gives a complete characterization of the degrees and dimen- 
sions in which nonnegative polynomials can be written as sums of squares of 
polynomials. In particular, Hilbert proves in [77] that there exist nonnegative 
polynomials that are not sums of squares, although explicit examples of such 
polynomials appeared only about 80 years later and the study of the gap between 
nonnegative and sums of squares polynomials continues to be an active area of 
research to this day. 

Motivated by a wealth of new applications and a modern viewpoint that em- 
phasizes efficient computation, there has also been a great deal of recent interest 
from the optimization community in the representation of nonnegative polyno- 
mials as sums of squares (sos). Indeed, many fundamental problems in applied 
and computational mathematics can be reformulated as either deciding whether 
certain polynomials are nonnegative or searching over a family of nonnegative 
polynomials. It is well-known however that if the degree of the polynomial is four 
or larger, deciding nonnegativity is an NP-hard problem. (As we mentioned in 
the last chapter, this follows e.g. as an immediate corollary of NP-hardness of 
deciding matrix copositivity [109].) On the other hand, it is also well-known that 
deciding whether a polynomial can be written as a sum of squares can be reduced 
to solving a semidefinite program, for which efficient algorithms, e.g. based on 
interior point methods, are available. The general machinery of the so-called "sos 
relaxation" has therefore been to replace the intractable nonnegativity require- 
ments with the more tractable sum of squares requirements that obviously provide 
a sufficient condition for polynomial nonnegativity. 

Some relatively recent applications that sum of squares relaxations have found 
span areas as diverse as control theory [118], [76], quantum computation [59], 
polynomial games [120], combinatorial optimization [71], geometric theorem prov- 
ing [123], and many others. 

■ 3.1.2 Convexity and sos-convexity 

Aside from nonnegativity, convexity is another fundamental property of poly- 
nomials that is of both theoretical and practical significance. In the previous 
chapter, we already listed a number of applications of establishing convexity of 
polynomials including global optimization, convex envelope approximation, Lya- 
punov analysis, data fitting, defining norms, etc. Unfortunately, however, we 
also showed that just like nonnegativity, convexity of polynomials is NP-hard to 
decide for polynomials of degree as low as four. Encouraged by the success of 
sum of squares methods as a viable substitute for nonnegativity, our focus in this 
chapter will be on the analogue of sum of squares for polynomial convexity: a 
notion known as sos-convexity. 

As we mentioned in our previous chapters in passing, sos-convexity (which 
gets its name from the work of Helton and Nie in [75]) is a sufficient condition 
for convexity of polynomials based on an appropriately defined sum of squares 
decomposition of the Hessian matrix; see the equivalent Definitions 2.4 and 3.4. 
The main computational advantage of sos-convexity stems from the fact that 
the problem of deciding whether a given polynomial is sos-convex amounts to 
solving a single semidefinite program. We will explain how this is exactly done 
in Section 3.2 of this chapter where we briefly review the well-known connection 
between sum of squares decomposition and semidefinite programming. 

Besides its computational implications, sos-convexity is an appealing concept 
since it bridges the geometric and algebraic aspects of convexity. Indeed, while 
the usual definition of convexity is concerned only with the geometry of the epi- 
graph, in sos-convexity this geometric property (or the nonnegativity of the Hes- 
sian) must be certified through a "simple" algebraic identity, namely the sum 
of squares factorization of the Hessian. The original motivation of Helton and 
Nie for defining sos-convexity was in relation to the question of semidefinite rep- 
resentability of convex sets [75]. But this notion has already appeared in the 
literature in a number of other settings [89], [90], [100], [44]. In particular, there 
has been much recent interest in the role of convexity in semialgebraic geometry 
[89], [26], [55], [91] and sos-convexity is a recurrent figure in this line of research. 

■ 3.1.3 Contributions and organization of this chapter 

The main contribution of this chapter is to establish the counterpart of Hilbert's 
characterization of the gap between nonnegativity and sum of squares for the 
notions of convexity and sos-convexity. We start by presenting some background 
material in Section 3.2. In Section 3.3, we prove an algebraic analogue of a classical 
result in convex analysis, which provides three equivalent characterizations for sos- 
convexity (Theorem 3.5). This result substantiates the fact that sos-convexity is 
the right sos relaxation for convexity. In Section 3.4, we present two explicit 
examples of convex polynomials that are not sos-convex, one of them being the 
first known such example. In Section 3.5, we provide the characterization of 
the gap between convexity and sos-convexity (Theorem 3.8 and Theorem 3.9). 
Subsection 3.5.1 includes the proofs of the cases where convexity and sos-convexity 
are equivalent and Subsection 3.5.2 includes the proofs of the cases where they 
are not. In particular, Theorem 3.16 and Theorem 3.17 present explicit examples 
of convex but not sos-convex polynomials that have dimension and degree as low 
as possible, and Theorem 3.18 provides a general construction for producing such 
polynomials in higher degrees. Some concluding remarks and an open problem 
are presented in Section 3.6. 

This chapter also includes two appendices. In Appendix A, we explain how 
the first example of a convex but not sos-convex polynomial was found with 
software using sum of squares programming techniques and the duality theory of 
semidefinite optimization. As a byproduct of this numerical procedure, we obtain 
a simple method for searching over a restricted family of nonnegative polynomials 
that are not sums of squares. In Appendix B, we give a formal (computer assisted) 
proof of validity of one of our minimal convex but not sos-convex polynomials. 

■ 3.2 Preliminaries 

■ 3.2.1 Background on nonnegativity and sum of squares 

For the convenience of the reader, we recall some basic concepts from the previous 
chapter and then introduce some new ones. We will be concerned throughout 
this chapter with polynomials with real coefficients. The ring of polynomials 
in n variables with real coefficients is denoted by $\mathbb{R}[x]$. A polynomial p is said 
to be nonnegative or positive semidefinite (psd) if $p(x) \geq 0$ for all $x \in \mathbb{R}^n$. 
We say that p is a sum of squares (sos), if there exist polynomials $q_1, \ldots, q_m$ 
such that $p = \sum_{i=1}^m q_i^2$. We denote the set of psd (resp. sos) polynomials in n 
variables and degree d by $\tilde{P}_{n,d}$ (resp. $\tilde{\Sigma}_{n,d}$). Any sos polynomial is clearly psd, 
so we have $\tilde{\Sigma}_{n,d} \subseteq \tilde{P}_{n,d}$. Recall that a homogeneous polynomial (or a form) is a 
polynomial where all the monomials have the same degree. A form p of degree d is 
a homogeneous function of degree d since it satisfies $p(\lambda x) = \lambda^d p(x)$ for any scalar 
$\lambda \in \mathbb{R}$. We say that a form p is positive definite if p(x) > 0 for all $x \neq 0$ in $\mathbb{R}^n$. 
Following standard notation, we denote the set of psd (resp. sos) homogeneous 
polynomials in n variables and degree d by $P_{n,d}$ (resp. $\Sigma_{n,d}$). Once again, we 
have the obvious inclusion $\Sigma_{n,d} \subseteq P_{n,d}$. All of the four sets $\Sigma_{n,d}$, $P_{n,d}$, $\tilde{\Sigma}_{n,d}$, $\tilde{P}_{n,d}$ 
are closed convex cones. The closedness of the sum of squares cone may not be 
so obvious. This fact was first proved by Robinson [141]. We will make crucial 
use of it in the proof of Theorem 3.5 in the next section. 

Any form of degree d in n variables can be "dehomogenized" into a polynomial 
of degree $\leq d$ in n - 1 variables by setting $x_n = 1$. Conversely, any polynomial 
p of degree d in n variables can be "homogenized" into a form $p_h$ of degree d in 
n + 1 variables, by adding a new variable y, and letting 

$p_h(x_1, \ldots, x_n, y) := y^d \, p(x_1/y, \ldots, x_n/y)$. 

The properties of being psd and sos are preserved under homogenization and 
dehomogenization [138]. 
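
On a coefficient representation, homogenization only touches exponents. The sketch below assumes, for illustration, that a polynomial is stored as a dictionary from exponent tuples to coefficients; the names `homogenize` and `dehomogenize` are hypothetical.

```python
def homogenize(p, d):
    """Form of degree d in one extra variable y (stored as the last exponent).

    Each monomial x^alpha of degree |alpha| <= d becomes x^alpha * y^(d - |alpha|),
    which is what y^d * p(x_1/y, ..., x_n/y) expands to.
    """
    return {exps + (d - sum(exps),): c for exps, c in p.items()}


def dehomogenize(f):
    """Set the last variable to 1 and merge monomials that coincide."""
    out = {}
    for exps, c in f.items():
        key = exps[:-1]
        out[key] = out.get(key, 0) + c
    return out


# p(x1, x2) = x1^2 + 3*x2 - 1, of degree 2.
p = {(2, 0): 1, (0, 1): 3, (0, 0): -1}
p_h = homogenize(p, 2)       # x1^2 + 3*x2*y - y^2
p_back = dehomogenize(p_h)   # recovers p
```

Every monomial of `p_h` has total degree 2, and setting the added variable to one returns the original polynomial.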

A very natural and fundamental question that as we mentioned earlier was 
answered by Hilbert is to understand in what dimensions and degrees nonnegative 
polynomials (or forms) can be represented as sums of squares, i.e., for what values 
of n and d we have $\Sigma_{n,d} = P_{n,d}$ (for forms) or $\tilde{\Sigma}_{n,d} = \tilde{P}_{n,d}$ (for polynomials). 
Note that because of the argument in the last paragraph, we have $\tilde{\Sigma}_{n,d} = \tilde{P}_{n,d}$ 
if and only if $\Sigma_{n+1,d} = P_{n+1,d}$. Hence, 
it is enough to answer the question just for polynomials or just for forms and the 
answer to the other one comes for free. 

Theorem 3.1 (Hilbert, [77]). $\tilde{\Sigma}_{n,d} = \tilde{P}_{n,d}$ if and only if n = 1 or d = 2 or 
(n, d) = (2, 4). Equivalently, $\Sigma_{n,d} = P_{n,d}$ if and only if n = 2 or d = 2 or 
(n, d) = (3, 4). 

The proofs of $\tilde{\Sigma}_{1,d} = \tilde{P}_{1,d}$ and $\tilde{\Sigma}_{n,2} = \tilde{P}_{n,2}$ are relatively simple and were 
known before Hilbert. On the other hand, the proof of the fairly surprising 
fact that $\tilde{\Sigma}_{2,4} = \tilde{P}_{2,4}$ (or equivalently $\Sigma_{3,4} = P_{3,4}$) is rather involved. We refer 
the interested reader to [130], [128], [46], and references in [138] for some modern 
expositions and alternative proofs of this result. Hilbert's other main contribution 
was to show that these are the only cases where nonnegativity and sum of squares 
are equivalent by giving a nonconstructive proof of existence of polynomials in 
$\tilde{P}_{2,6} \setminus \tilde{\Sigma}_{2,6}$ and $\tilde{P}_{3,4} \setminus \tilde{\Sigma}_{3,4}$ (or equivalently forms in $P_{3,6} \setminus \Sigma_{3,6}$ and 
$P_{4,4} \setminus \Sigma_{4,4}$). From 
this, it follows with simple arguments that in all higher dimensions and degrees 
there must also be psd but not sos polynomials; see [138]. Explicit examples 
of such polynomials appeared in the 1960s starting from the celebrated Motzkin 
form [107]: 

$M(x_1, x_2, x_3) = x_1^4 x_2^2 + x_1^2 x_2^4 - 3 x_1^2 x_2^2 x_3^2 + x_3^6$, (3.1) 

which belongs to $P_{3,6} \setminus \Sigma_{3,6}$, and continuing a few years later with the Robinson 
form [141]: 

$R(x_1, x_2, x_3, x_4) = x_1^2 (x_1 - x_4)^2 + x_2^2 (x_2 - x_4)^2 + x_3^2 (x_3 - x_4)^2 + 2 x_1 x_2 x_3 (x_1 + x_2 + x_3 - 2 x_4)$, (3.2) 

which belongs to $P_{4,4} \setminus \Sigma_{4,4}$. 

Several other constructions of psd polynomials that are not sos have appeared 
in the literature since. An excellent survey is [138]. See also [139] and [27]. 

■ 3.2.2 Connection to semidefinite programming and matrix generalizations 

As we remarked before, what makes sum of squares an appealing concept from a 
computational viewpoint is its relation to semidefinite programming. It is well- 
known (see e.g. [118], [119]) that a polynomial p in n variables and of even degree 
d is a sum of squares if and only if there exists a positive semidefinite matrix Q 
(often called the Gram matrix) such that 

p(x) = z T Qz, 

where z is the vector of monomials of degree up to d/2 

$z = [1, x_1, x_2, \ldots, x_n, x_1 x_2, \ldots, x_n^{d/2}]^T$. (3.3) 

The set of all such matrices Q is the feasible set of a semidefinite program (SDP). 
For fixed d, the size of this semidefinite program is polynomial in n. Semidefinite 
programs can be solved with arbitrary accuracy in polynomial time. There are 
several implementations of semidefinite programming solvers, based on interior 
point algorithms among others, that are very efficient in practice and widely 
used; see [162] and references therein. 
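
The Gram matrix identity can be made concrete in one variable. The sketch below does not solve a semidefinite program; it only verifies a certificate, building Q as a sum of outer products (so that it is positive semidefinite by construction) and expanding $z^T Q z$ for z = [1, x, x^2]. The function names and the example polynomial are assumptions chosen for illustration.

```python
def gram_from_squares(vectors):
    """Q = sum of v v^T over the given coefficient vectors v.

    Each v holds the coefficients of one polynomial q_k in the basis z, so
    z^T Q z = sum_k q_k(x)^2 and Q is positive semidefinite by construction.
    """
    m = len(vectors[0])
    return [[sum(v[i] * v[j] for v in vectors) for j in range(m)] for i in range(m)]


def gram_to_coeffs(Q):
    """Coefficients (lowest degree first) of z^T Q z for z = [1, x, ..., x^(m-1)]."""
    m = len(Q)
    coeffs = [0] * (2 * m - 1)
    for i in range(m):
        for j in range(m):
            coeffs[i + j] += Q[i][j]  # entry (i, j) multiplies x^i * x^j
    return coeffs


# p(x) = (x^2 - 1)^2 + (2x)^2 = x^4 + 2x^2 + 1
Q = gram_from_squares([[-1, 0, 1], [0, 2, 0]])
coeffs = gram_to_coeffs(Q)
```

Matching `coeffs` against the coefficients of p gives the linear constraints on Q; together with positive semidefiniteness of Q they describe the feasible set of the semidefinite program discussed above.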

The notions of positive semidefiniteness and sum of squares of scalar polyno- 
mials can be naturally extended to polynomial matrices, i.e., matrices with entries 
in $\mathbb{R}[x]$. We say that a symmetric polynomial matrix $U(x) \in \mathbb{R}[x]^{m \times m}$ is positive 
semidefinite if U(x) is positive semidefinite in the matrix sense for all $x \in \mathbb{R}^n$, i.e., 
if U(x) has nonnegative eigenvalues for all $x \in \mathbb{R}^n$. It is straightforward to see 
that this condition holds if and only if the polynomial $y^T U(x) y$ in m + n 
variables [x; y] is psd. A homogeneous polynomial matrix U(x) is said to be positive 
definite, if it is positive definite in the matrix sense, i.e., has positive eigenvalues, 
for all x in $\mathbb{R}^n$. The definition of an sos-matrix is as follows [88], [62], [152]. 

Definition 3.2. A symmetric polynomial matrix $U(x) \in \mathbb{R}[x]^{m \times m}$, $x \in \mathbb{R}^n$, is an 
sos-matrix if there exists a polynomial matrix $V(x) \in \mathbb{R}[x]^{s \times m}$ for some $s \in \mathbb{N}$, 
such that $U(x) = V^T(x) V(x)$. 

It turns out that a polynomial matrix $U(x) \in \mathbb{R}[x]^{m \times m}$, $x \in \mathbb{R}^n$, is an 
sos-matrix if and only if the scalar polynomial $y^T U(x) y$ is a sum of squares in 
$\mathbb{R}[x; y]$; see [88]. This is a useful fact because in particular it gives us an easy way 
of checking whether a polynomial matrix is an sos-matrix by solving a semidef- 
inite program. Once again, it is obvious that being an sos-matrix is a sufficient 
condition for a polynomial matrix to be positive semidefinite. 

■ 3.2.3 Background on convexity and sos-convexity

A polynomial $p$ is (globally) convex if for all $x$ and $y$ in $\mathbb{R}^n$ and all $\lambda \in [0,1]$, we
have

$$p(\lambda x + (1-\lambda) y) \le \lambda p(x) + (1-\lambda) p(y). \tag{3.4}$$

Since polynomials are continuous functions, the inequality in (3.4) holds if and
only if it holds for a fixed value of $\lambda \in (0,1)$, say, $\lambda = \frac{1}{2}$. In other words, $p$ is
convex if and only if

$$p(\tfrac{1}{2}x + \tfrac{1}{2}y) \le \tfrac{1}{2}p(x) + \tfrac{1}{2}p(y) \tag{3.5}$$

for all $x$ and $y$; see e.g. [148, p. 71]. Recall from the previous chapter that except
for the trivial case of linear polynomials, an odd degree polynomial is clearly never
convex.

For the sake of direct comparison with a result that we derive in the next 
section (Theorem 3.5), we recall next a classical result from convex analysis on 
the first and second order characterization of convexity. The proof can be found 
in many convex optimization textbooks, e.g. [38, p. 70]. The theorem is of course 
true for any twice differentiable function, but for our purposes we state it for 
polynomials. 

Theorem 3.3. Let $p := p(x)$ be a polynomial. Let $\nabla p := \nabla p(x)$ denote its
gradient and let $H := H(x)$ be its Hessian, i.e., the $n \times n$ symmetric matrix of
second derivatives. Then the following are equivalent.

(a) $p(\tfrac{1}{2}x + \tfrac{1}{2}y) \le \tfrac{1}{2}p(x) + \tfrac{1}{2}p(y)$, $\forall x, y \in \mathbb{R}^n$; (i.e., $p$ is convex).

(b) $p(y) \ge p(x) + \nabla p(x)^T (y - x)$, $\forall x, y \in \mathbb{R}^n$.

(c) $y^T H(x) y \ge 0$, $\forall x, y \in \mathbb{R}^n$; (i.e., $H(x)$ is a positive semidefinite polynomial
matrix).
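To make the three characterizations concrete, the following Python sketch evaluates the midpoint gap, the first order gap, and the Hessian quadratic form on a grid of rational points for one convex quartic. The polynomial and the helper names are assumptions made for illustration, and sampling of this kind is only a sanity check, not a proof of convexity.

```python
from fractions import Fraction
from itertools import product

# Illustrative convex quartic (an assumption): p = x1^4 + x1^2 x2^2 + x2^4
def p(x):
    a, b = x
    return a**4 + a**2 * b**2 + b**4

def grad(x):
    a, b = x
    return (4 * a**3 + 2 * a * b**2, 2 * a**2 * b + 4 * b**3)

def hess(x):
    a, b = x
    return ((12 * a**2 + 2 * b**2, 4 * a * b),
            (4 * a * b, 2 * a**2 + 12 * b**2))

def g_half(x, y):
    """Gap in condition (a): (p(x) + p(y))/2 - p((x + y)/2)."""
    mid = tuple((xi + yi) / 2 for xi, yi in zip(x, y))
    return (p(x) + p(y)) / 2 - p(mid)

def g_grad(x, y):
    """Gap in condition (b): p(y) - p(x) - grad p(x)^T (y - x)."""
    return p(y) - p(x) - sum(gi * (yi - xi) for gi, xi, yi in zip(grad(x), x, y))

def g_hess(x, y):
    """Quadratic form in condition (c): y^T H(x) y."""
    h = hess(x)
    return sum(y[i] * h[i][j] * y[j] for i in range(2) for j in range(2))

grid = [Fraction(k, 2) for k in range(-3, 4)]
pairs = [((a, b), (c, d)) for a, b, c, d in product(grid, repeat=4)]
all_nonneg = all(g(x, y) >= 0 for x, y in pairs for g in (g_half, g_grad, g_hess))
print(all_nonneg)
```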

Helton and Nie proposed in [75] the notion of sos-convexity as an sos relaxation
for the second order characterization of convexity (condition (c) above).

Definition 3.4. A polynomial $p$ is sos-convex if its Hessian $H := H(x)$ is an
sos-matrix.

With what we have discussed so far, it should be clear that sos-convexity 
is a sufficient condition for convexity of polynomials that can be checked with 
semidefinite programming. In the next section, we will show some other natural 
sos relaxations for polynomial convexity, which will turn out to be equivalent to 
sos-convexity. 

We end this section by introducing some final notation: $\tilde{C}_{n,d}$ and $\widetilde{\Sigma C}_{n,d}$ will
respectively denote the set of convex and sos-convex polynomials in $n$ variables
and degree $d$; $C_{n,d}$ and $\Sigma C_{n,d}$ will respectively denote the set of convex and
sos-convex homogeneous polynomials in $n$ variables and degree $d$. Again, these four
sets are closed convex cones and we have the obvious inclusions $\widetilde{\Sigma C}_{n,d} \subseteq \tilde{C}_{n,d}$ and
$\Sigma C_{n,d} \subseteq C_{n,d}$.

■ 3.3 Equivalent algebraic relaxations for convexity of polynomials

An obvious way to formulate alternative sos relaxations for convexity of polyno- 
mials is to replace every inequality in Theorem 3.3 with its sos version. In this 
section we examine how these relaxations relate to each other. We also comment 
on the size of the resulting semidefinite programs. 

Our result below can be thought of as an algebraic analogue of Theorem 3.3. 

Theorem 3.5. Let $p := p(x)$ be a polynomial of degree $d$ in $n$ variables with its
gradient and Hessian denoted respectively by $\nabla p := \nabla p(x)$ and $H := H(x)$. Let
$g_\lambda$, $g_\nabla$, and $g_{\nabla^2}$ be defined as

$$\begin{aligned}
g_\lambda(x,y) &= (1-\lambda)p(x) + \lambda p(y) - p((1-\lambda)x + \lambda y), \\
g_\nabla(x,y) &= p(y) - p(x) - \nabla p(x)^T (y - x), \\
g_{\nabla^2}(x,y) &= y^T H(x) y.
\end{aligned} \tag{3.6}$$

Then the following are equivalent:

(a) $g_{1/2}(x,y)$ is sos.$^1$

(b) $g_\nabla(x,y)$ is sos.

(c) $g_{\nabla^2}(x,y)$ is sos; (i.e., $H(x)$ is an sos-matrix).

$^1$ The constant $\frac{1}{2}$ in $g_{1/2}(x,y)$ of condition (a) is arbitrary and is chosen for convenience. One
can show that $g_{1/2}$ being sos implies that $g_\lambda$ is sos for any fixed $\lambda \in [0,1]$. Conversely, if $g_\lambda$ is sos
for some $\lambda \in (0,1)$, then $g_{1/2}$ is sos. The proofs are similar to the proof of (a)$\Rightarrow$(b).

Proof. (a)$\Rightarrow$(b): Assume $g_{1/2}$ is sos. We start by proving that $g_{1/2^k}$ will also be sos
for any integer $k \ge 2$. A little bit of straightforward algebra yields the relation

$$g_{\frac{1}{2^{k+1}}}(x,y) = \tfrac{1}{2}\, g_{\frac{1}{2^k}}(x,y) + g_{\frac{1}{2}}\!\left(x, \tfrac{2^k-1}{2^k}x + \tfrac{1}{2^k}y\right). \tag{3.7}$$

The second term on the right hand side of (3.7) is always sos because $g_{1/2}$ is sos.
Hence, this relation shows that for any $k$, if $g_{1/2^k}$ is sos, then so is $g_{1/2^{k+1}}$. Since for
$k = 1$, both terms on the right hand side of (3.7) are sos by assumption, induction
immediately gives that $g_{1/2^k}$ is sos for all $k$.
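The induction step can be checked numerically. The Python sketch below assumes that relation (3.7) reads $g_{1/2^{k+1}}(x,y) = \tfrac{1}{2} g_{1/2^k}(x,y) + g_{1/2}(x, (1 - 2^{-k})x + 2^{-k}y)$, which is consistent with the description of its two terms in the text, and verifies this identity in exact rational arithmetic. The test polynomial and the function names are assumptions; the identity itself does not depend on the choice of $p$.

```python
from fractions import Fraction

# Illustrative test polynomial (an assumption; the identity holds for any p)
def p(x):
    a, b = x
    return a**4 - 3 * a**3 * b + 2 * a * b**2 + b**4 - 5 * a + 1

def g(lam, x, y):
    """g_lambda(x, y) = (1 - lam) p(x) + lam p(y) - p((1 - lam) x + lam y)."""
    point = tuple((1 - lam) * xi + lam * yi for xi, yi in zip(x, y))
    return (1 - lam) * p(x) + lam * p(y) - p(point)

def identity_holds(k, x, y):
    """Check g_{1/2^(k+1)}(x,y) == g_{1/2^k}(x,y)/2 + g_{1/2}(x, w) with w on the segment."""
    lam_k = Fraction(1, 2**k)
    w = tuple((1 - lam_k) * xi + lam_k * yi for xi, yi in zip(x, y))
    lhs = g(lam_k / 2, x, y)
    rhs = g(lam_k, x, y) / 2 + g(Fraction(1, 2), x, w)
    return lhs == rhs

x = (Fraction(2, 3), Fraction(-1, 5))
y = (Fraction(-7, 4), Fraction(3, 2))
print(all(identity_holds(k, x, y) for k in range(1, 8)))
```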
Now, let us rewrite $g_\lambda$ as

$$g_\lambda(x,y) = p(x) + \lambda(p(y) - p(x)) - p(x + \lambda(y - x)).$$

We have

$$\frac{g_\lambda(x,y)}{\lambda} = p(y) - p(x) - \frac{p(x + \lambda(y-x)) - p(x)}{\lambda}. \tag{3.8}$$

Next, we take the limit of both sides of (3.8) by letting $\lambda = \frac{1}{2^k} \to 0$ as $k \to \infty$.
Because $p$ is differentiable, the right hand side of (3.8) will converge to $g_\nabla$. On
the other hand, our preceding argument implies that $\frac{g_\lambda}{\lambda}$ is an sos polynomial
(of degree $d$ in $2n$ variables) for any $\lambda = \frac{1}{2^k}$. Moreover, as $\lambda$ goes to zero, the
coefficients of $\frac{g_\lambda}{\lambda}$ remain bounded since the limit of this sequence is $g_\nabla$, which
must have bounded coefficients (see (3.6)). By closedness of the sos cone, we
conclude that the limit $g_\nabla$ must be sos.

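(b)$\Rightarrow$(a): Assume $g_\nabla$ is sos. It is easy to check that

$$g_{1/2}(x,y) = \tfrac{1}{2}\, g_\nabla\!\left(\tfrac{1}{2}x + \tfrac{1}{2}y,\, x\right) + \tfrac{1}{2}\, g_\nabla\!\left(\tfrac{1}{2}x + \tfrac{1}{2}y,\, y\right),$$

and hence $g_{1/2}$ is sos.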
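The identity used in this step, namely that $g_{1/2}(x,y)$ is the average of $g_\nabla$ evaluated from the midpoint towards $x$ and towards $y$, can be confirmed in exact arithmetic. The following Python sketch does so for one test polynomial; the polynomial, its hand-computed gradient, and the helper names are assumptions made for illustration.

```python
from fractions import Fraction

# Illustrative test polynomial and its gradient (assumptions for this sketch)
def p(x):
    a, b = x
    return a**4 - 3 * a**3 * b + 2 * a * b**2 + b**4 - 5 * a + 1

def grad(x):
    a, b = x
    return (4 * a**3 - 9 * a**2 * b + 2 * b**2 - 5,
            -3 * a**3 + 4 * a * b + 4 * b**3)

def g_half(x, y):
    mid = tuple((xi + yi) / 2 for xi, yi in zip(x, y))
    return (p(x) + p(y)) / 2 - p(mid)

def g_grad(x, y):
    return p(y) - p(x) - sum(gi * (yi - xi) for gi, xi, yi in zip(grad(x), x, y))

def midpoint_identity(x, y):
    """g_half(x, y) == (g_grad(m, x) + g_grad(m, y)) / 2 with m the midpoint of x and y."""
    m = tuple((xi + yi) / 2 for xi, yi in zip(x, y))
    return g_half(x, y) == (g_grad(m, x) + g_grad(m, y)) / 2

x = (Fraction(2, 3), Fraction(-1, 5))
y = (Fraction(-7, 4), Fraction(3, 2))
print(midpoint_identity(x, y))
```

The gradient terms cancel because $(x - m) + (y - m) = 0$ at the midpoint $m$, which is why the identity holds for every polynomial.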

(b)$\Rightarrow$(c): Let us write the second order Taylor approximation of $p$ around $x$:

$$p(y) = p(x) + \nabla^T p(x)(y - x) + \tfrac{1}{2}(y-x)^T H(x)(y-x) + o(\|y-x\|^2).$$

After rearranging terms, letting $y = x + \epsilon z$ (for $\epsilon > 0$), and dividing both sides
by $\epsilon^2$ we get:

$$(p(x + \epsilon z) - p(x))/\epsilon^2 - \nabla^T p(x) z/\epsilon = \tfrac{1}{2} z^T H(x) z + \tfrac{1}{\epsilon^2}\, o(\epsilon^2 \|z\|^2). \tag{3.9}$$

The left hand side of (3.9) is $g_\nabla(x, x + \epsilon z)/\epsilon^2$ and therefore for any fixed $\epsilon > 0$,
it is an sos polynomial by assumption. As we take $\epsilon \to 0$, by closedness of the
sos cone, the left hand side of (3.9) converges to an sos polynomial. On the other
hand, as the limit is taken, the term $\frac{1}{\epsilon^2} o(\epsilon^2 \|z\|^2)$ vanishes and hence we have that
$z^T H(x) z$ must be sos.

(c)$\Rightarrow$(b): Following the strategy of the proof of the classical case in [160, p.
165], we start by writing the Taylor expansion of $p$ around $x$ with the integral
form of the remainder:

$$p(y) = p(x) + \nabla^T p(x)(y - x) + \int_0^1 (1-t)(y-x)^T H(x + t(y-x))(y-x)\, dt. \tag{3.10}$$

Since $y^T H(x) y$ is sos by assumption, for any $t \in [0,1]$ the integrand

$$(1-t)(y-x)^T H(x + t(y-x))(y-x)$$

is an sos polynomial of degree $d$ in $x$ and $y$. From (3.10) we have

$$g_\nabla = \int_0^1 (1-t)(y-x)^T H(x + t(y-x))(y-x)\, dt.$$

It then follows that $g_\nabla$ is sos because integrals of sos polynomials, if they exist,
are sos.

□ 

We conclude that conditions (a), (b), and (c) are equivalent sufficient conditions
for convexity of polynomials, and can each be checked with a semidefinite
program as explained in Subsection 3.2.2. It is easy to see that all three polynomials
$g_{1/2}(x,y)$, $g_\nabla(x,y)$, and $g_{\nabla^2}(x,y)$ are polynomials in $2n$ variables and of degree $d$.
(Note that each differentiation reduces the degree by one.) Each of these polynomials
has a specific structure that can be exploited for formulating smaller SDPs.
For example, the symmetries $g_{1/2}(x,y) = g_{1/2}(y,x)$ and $g_{\nabla^2}(x,-y) = g_{\nabla^2}(x,y)$ can
be taken advantage of via symmetry reduction techniques developed in [62].

The issue of symmetry reduction aside, we would like to point out that formulation
(c) (which was the original definition of sos-convexity) can be significantly
more efficient than the other two conditions. The reason is that the polynomial
$g_{\nabla^2}(x,y)$ is always quadratic and homogeneous in $y$ and of degree $d-2$ in $x$. This
makes $g_{\nabla^2}(x,y)$ much more sparse than $g_\nabla(x,y)$ and $g_{1/2}(x,y)$, which have degree
$d$ both in $x$ and in $y$. Furthermore, because of the special bipartite structure of
$y^T H(x) y$, only monomials of the form $x_i^k y_j$ will appear in the vector of monomials
(3.3). This in turn reduces the size of the Gram matrix, and hence the size of the
SDP. It is perhaps not too surprising that the characterization of convexity based
on the Hessian matrix is a more efficient condition to check. After all, this is a
local condition (curvature at every point in every direction must be nonnegative),
whereas conditions (a) and (b) are both global.
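A rough count illustrates the savings. The Python sketch below compares the length of the full monomial vector of degree up to $d/2$ in the $2n$ variables $(x, y)$ with the length of a vector that only contains products of a single $y_j$ with a monomial in $x$ of degree at most $(d-2)/2$. This reading of the bipartite structure, and the function names, are assumptions made for illustration; further reductions (for example from symmetry) are not accounted for.

```python
from math import comb

def full_basis_size(n, d):
    """Number of monomials of degree <= d/2 in the 2n variables (x, y)."""
    return comb(2 * n + d // 2, d // 2)

def bipartite_basis_size(n, d):
    """Number of monomials y_j * (monomial in x of degree <= (d - 2)/2)."""
    return n * comb(n + (d - 2) // 2, (d - 2) // 2)

for n, d in [(2, 4), (3, 4), (3, 6), (6, 4)]:
    print(n, d, full_basis_size(n, d), bipartite_basis_size(n, d))
```

The Gram matrix has as many rows as the monomial vector has entries, so for $n = 3$ and $d = 6$ the count drops from 84 to 30 under these assumptions.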

Remark 3.3.1. There has been yet another proposal for an sos relaxation for 
convexity of polynomials in [44] . However, we have shown in [8] that the condition 
in [44] is at least as conservative as the three conditions in Theorem 3.5 and also 
significantly more expensive to check. 

Remark 3.3.2. Just like convexity, the property of sos-convexity is preserved under
restrictions to affine subspaces. This is perhaps most directly seen through
characterization (a) of sos-convexity in Theorem 3.5, by also noting that sum of
squares is preserved under restrictions. Unlike convexity however, if a polynomial
is sos-convex on every line (or even on every proper affine subspace), this does
not imply that the polynomial is sos-convex.

As an application of Theorem 3.5, we use our new characterization of sos- 
convexity to give a short proof of an interesting lemma of Helton and Nie. 

Lemma 3.6. (Helton and Nie [75, Lemma 8]). Every sos-convex form is sos. 

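Proof. Let $p$ be an sos-convex form of degree $d$. We know from Theorem 3.5 that
sos-convexity of $p$ is equivalent to the polynomial
$g_{1/2}(x,y) = \tfrac{1}{2}p(x) + \tfrac{1}{2}p(y) - p(\tfrac{1}{2}x + \tfrac{1}{2}y)$
being sos. But since sos is preserved under restrictions and $p(0) = 0$,
this implies that

$$g_{1/2}(x,0) = \tfrac{1}{2}p(x) - p(\tfrac{1}{2}x) = \left(\tfrac{1}{2} - (\tfrac{1}{2})^d\right) p(x)$$

is sos. Since $\tfrac{1}{2} - (\tfrac{1}{2})^d > 0$ for $d \ge 2$, it follows that $p$ is sos. □

Note that the same argument also shows that convex forms are psd.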
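The restriction used in this proof relies only on homogeneity, and can be checked in exact arithmetic: for a form $p$ of degree $d$, $g_{1/2}(x, 0) = (\tfrac{1}{2} - 2^{-d})\,p(x)$. The Python sketch below confirms this for one quartic form; the form and the helper names are assumptions made for illustration.

```python
from fractions import Fraction

D = 4  # degree of the illustrative form below

def p(x):
    """Illustrative quartic form (an assumption): x1^4 + x1^2 x2^2 + x2^4."""
    a, b = x
    return a**4 + a**2 * b**2 + b**4

def g_half(x, y):
    mid = tuple((xi + yi) / 2 for xi, yi in zip(x, y))
    return (p(x) + p(y)) / 2 - p(mid)

def restriction_identity(x):
    """Check g_half(x, 0) == (1/2 - (1/2)^D) * p(x)."""
    zero = (Fraction(0),) * len(x)
    return g_half(x, zero) == (Fraction(1, 2) - Fraction(1, 2**D)) * p(x)

samples = [(Fraction(a, 3), Fraction(b, 2)) for a in range(-2, 3) for b in range(-2, 3)]
print(all(restriction_identity(x) for x in samples))
```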



■ 3.4 Some constructions of convex but not sos-convex polynomials

It is natural to ask whether sos-convexity is not only a sufficient condition for
convexity of polynomials but also a necessary one. In other words, could it be
the case that if the Hessian of a polynomial is positive semidefinite, then it must
factor? To give a negative answer to this question, one has to prove existence
of a convex polynomial that is not sos-convex, i.e., a polynomial $p$ for which one
(and hence all) of the three polynomials $g_{1/2}$, $g_\nabla$, and $g_{\nabla^2}$ in (3.6) are psd but not
sos. Note that existence of psd but not sos polynomials does not imply existence
of convex but not sos-convex polynomials on its own. The reason is that the
polynomials $g_{1/2}$, $g_\nabla$, and $g_{\nabla^2}$ all possess a very special structure.$^2$ For example,
$y^T H(x) y$ has the structure of being quadratic in $y$ and a Hessian in $x$. (Not every
polynomial matrix is a valid Hessian.) The Motzkin or the Robinson polynomials
in (3.1) and (3.2) for example are clearly not of this structure.

$^2$ There are many situations where requiring a specific structure on polynomials makes psd
equivalent to sos. As an example, we know that there are forms in $P_{4,4} \setminus \Sigma_{4,4}$. However, if we
require the forms to have only even monomials, then all such nonnegative forms in 4 variables
and degree 4 are sums of squares [57].

■ 3.4.1 The first example

In [11], [7], we presented the first example of a convex polynomial that is not
sos-convex$^3$:

$$\begin{aligned}
p(x_1,x_2,x_3) = {} & 32x_1^8 + 118x_1^6x_2^2 + 40x_1^6x_3^2 + 25x_1^4x_2^4 - 43x_1^4x_2^2x_3^2 - 35x_1^4x_3^4 \\
& + 3x_1^2x_2^4x_3^2 - 16x_1^2x_2^2x_3^4 + 24x_1^2x_3^6 + 16x_2^8 + 44x_2^6x_3^2 + 70x_2^4x_3^4 \\
& + 60x_2^2x_3^6 + 30x_3^8.
\end{aligned} \tag{3.11}$$

As we will see later in this chapter, this form which lives in $C_{3,8} \setminus \Sigma C_{3,8}$ turns
out to be an example in the smallest possible number of variables but not in the
smallest degree.

In Appendix A, we will explain how the polynomial in (3.11) was found. The 
proof that this polynomial is convex but not sos-convex is omitted and can be 
found in [11]. However, we would like to highlight an idea behind this proof that 
will be used again in this chapter. As the following lemma demonstrates, one 
way to ensure a polynomial is not sos-convex is by enforcing one of the principal 
minors of its Hessian matrix to be not sos. 

Lemma 3.7. If $P(x) \in \mathbb{R}[x]^{m \times m}$ is an sos-matrix, then all its $2^m - 1$ principal
minors$^4$ are sos polynomials. In particular, $\det(P)$ and the diagonal elements of
$P$ must be sos polynomials.

$^3$ Assuming P$\neq$NP, and given the NP-hardness of deciding polynomial convexity proven in the
previous chapter, one would expect to see convex polynomials that are not sos-convex. However,
we found the first such polynomial before we had proven the NP-hardness result. Moreover,
from complexity considerations, even assuming P$\neq$NP, one cannot conclude existence of convex
but not sos-convex polynomials for any fixed finite value of the number of variables $n$.

$^4$ The principal minors of an $m \times m$ matrix $A$ are the determinants of all $k \times k$ ($1 \le k \le m$)
sub-blocks whose rows and columns come from the same index set $S \subseteq \{1, \ldots, m\}$.

Proof. We first prove that $\det(P)$ is sos. By Definition 3.2, we have $P(x) = M^T(x) M(x)$
for some $s \times m$ polynomial matrix $M(x)$. If $s = m$, we have

$$\det(P) = \det(M^T)\det(M) = (\det(M))^2$$

and the result is immediate. If $s > m$, the result follows from the Cauchy-Binet
formula$^5$. We have

$$\begin{aligned}
\det(P) &= \sum_S \det\big((M^T)_S\big) \det(M_S) \\
&= \sum_S \det\big((M_S)^T\big) \det(M_S) \\
&= \sum_S (\det(M_S))^2.
\end{aligned}$$

Finally, when $s < m$, $\det(P)$ is zero which is trivially sos. In fact, the Cauchy-Binet
formula also holds for $s = m$ and $s < m$, but we have separated these cases
for clarity of presentation.

Next, we need to prove that the minors corresponding to smaller principal
blocks of $P$ are also sos. Define $\mathcal{M} = \{1, \ldots, m\}$ and $\mathcal{S} = \{1, \ldots, s\}$, and let $I$ and $J$ be nonempty
subsets of $\mathcal{M}$. Denote by $P_{IJ}$ a sub-block of $P$ with row indices from $I$ and
column indices from $J$, and use the same convention for sub-blocks of $M$ and $M^T$. It is easy to see that

$$P_{JJ} = (M^T)_{J\mathcal{S}}\, M_{\mathcal{S}J} = (M_{\mathcal{S}J})^T M_{\mathcal{S}J}.$$

Therefore, $P_{JJ}$ is an sos-matrix itself. By the preceding argument $\det(P_{JJ})$ must
be sos, and hence all the principal minors are sos. □
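The key step of this proof, the Cauchy-Binet expansion $\det(M^T M) = \sum_S (\det M_S)^2$, can be checked on small examples. The Python sketch below does so for a constant $3 \times 2$ matrix and for a polynomial matrix evaluated at several integer points. Both matrices and all helper names are hypothetical and chosen only for illustration.

```python
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row (fine for tiny matrices)."""
    if not m:
        return 1
    return sum((-1) ** j * a * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j, a in enumerate(m[0]))

def gram(M):
    """Return P = M^T M for an s x m matrix M given as a list of rows."""
    cols = list(zip(*M))
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]

def cauchy_binet_sides(M):
    """Return (det(M^T M), sum over m-element row subsets S of det(M_S)^2)."""
    s, m = len(M), len(M[0])
    lhs = det(gram(M))
    rhs = sum(det([list(M[i]) for i in S]) ** 2 for S in combinations(range(s), m))
    return lhs, rhs

def M_of_x(x):
    """A hypothetical 3 x 2 polynomial matrix M(x), evaluated at the number x."""
    return [[x, 1], [x * x, x], [1, -x]]

print(cauchy_binet_sides([[1, 2], [0, -1], [3, 1]]))
print(all(lhs == rhs for lhs, rhs in (cauchy_binet_sides(M_of_x(x)) for x in range(-3, 4))))
```

Because the right hand side is a sum of squares of the polynomials $\det(M_S)$, the determinant of an sos-matrix is itself sos.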

Remark 3.4.1. Interestingly, the converse of Lemma 3.7 does not hold. A counterexample
is the Hessian of the form $f$ in (3.15) that we will present in the next
section. All 7 principal minors of the $3 \times 3$ Hessian of this form are sos polynomials,
even though the Hessian is not an sos-matrix. This is in contrast with the fact 
that a polynomial matrix is positive semidefinite if and only if all its principal 
minors are psd polynomials. The latter statement follows immediately from the 
well-known fact that a constant matrix is positive semidefinite if and only if all 
its principal minors are nonnegative. 

■ 3.4.2 A "clean" example 

We next present another example of a convex but not sos-convex form whose
construction is in fact related to our proof of NP-hardness of deciding convexity
of quartic forms from Chapter 2. The example is in $C_{6,4} \setminus \Sigma C_{6,4}$ and by contrast
to the example of the previous subsection, it will turn out to be minimal in the
degree but not in the number of variables. What is nice about this example is
that unlike the other examples in this chapter it has not been derived with the
assistance of a computer and semidefinite programming:

$$\begin{aligned}
q(x_1,\ldots,x_6) = {} & x_1^4 + x_2^4 + x_3^4 + x_4^4 + x_5^4 + x_6^4 \\
& + 2\,(x_1^2x_2^2 + x_1^2x_3^2 + x_2^2x_3^2 + x_4^2x_5^2 + x_4^2x_6^2 + x_5^2x_6^2) \\
& + \tfrac{1}{2}\,(x_1^2x_4^2 + x_2^2x_5^2 + x_3^2x_6^2) + x_1^2x_6^2 + x_2^2x_4^2 + x_3^2x_5^2 \\
& - (x_1x_2x_4x_5 + x_1x_3x_4x_6 + x_2x_3x_5x_6).
\end{aligned} \tag{3.12}$$

$^5$ Given matrices $A$ and $B$ of size $m \times s$ and $s \times m$ respectively, the Cauchy-Binet formula
states that

$$\det(AB) = \sum_S \det(A_S)\det(B_S),$$

where $S$ is a subset of $\{1, \ldots, s\}$ with $m$ elements, $A_S$ denotes the $m \times m$ matrix whose columns
are the columns of $A$ with index from $S$, and similarly $B_S$ denotes the $m \times m$ matrix whose
rows are the rows of $B$ with index from $S$.

The proof that this polynomial is convex but not sos-convex can be extracted
from Theorems 2.3 and 2.5 of Chapter 2. The reader can observe that these two
theorems put together give us a general procedure for producing convex but not
sos-convex quartic forms from any example of a psd but not sos biquadratic form$^6$.
The biquadratic form that has led to the form above is that of Choi in [45].

The example in (3.12) also shows that convex forms that possess strong sym- 
metry properties can still fail to be sos-convex. The symmetries in this form 
are inherited from the rich symmetry structure of the biquadratic form of Choi 
(see [62]). In general, symmetries are of interest in the study of positive semidef- 
inite and sums of squares polynomials because the gap between psd and sos can 
often behave very differently depending on the symmetry properties; see e.g. [28]. 

■ 3.5 Characterization of the gap between convexity and sos-convexity

Now that we know there exist convex polynomials that are not sos-convex, our 
final and main goal is to give a complete characterization of the degrees and 
dimensions in which such polynomials can exist. This is achieved in the next 
theorem. 

Theorem 3.8. $\widetilde{\Sigma C}_{n,d} = \tilde{C}_{n,d}$ if and only if $n = 1$ or $d = 2$ or $(n,d) = (2,4)$.

$^6$ The reader can refer to Definition 2.2 of the previous chapter to recall the definition of a
biquadratic form.

We would also like to have such a characterization for homogeneous polynomials.
Although convexity is a property that is in some sense more meaningful for
nonhomogeneous polynomials than for forms, one motivation for studying convexity
of forms is in their relation to norms [140]. Also, in view of the fact that
we have a characterization of the gap between nonnegativity and sums of squares
both for polynomials and for forms, it is very natural to inquire about the same result
for convexity and sos-convexity. The next theorem presents this characterization
for forms.

Theorem 3.9. $\Sigma C_{n,d} = C_{n,d}$ if and only if $n = 2$ or $d = 2$ or $(n,d) = (3,4)$.

The result $\Sigma C_{3,4} = C_{3,4}$ of this theorem is to be presented in full detail in [2].
The remainder of this chapter is solely devoted to the proof of Theorem 3.8 and
the proof of Theorem 3.9 except for the case $(n,d) = (3,4)$. Before we present
these proofs, we shall make two important remarks.

Remark 3.5.1. Difficulty with homogenization and dehomogenization. 

Recall from Subsection 3.2.1 and Theorem 3.1 that characterizing the gap between 
nonnegativity and sum of squares for polynomials is equivalent to accomplishing 
this task for forms. Unfortunately, the situation is more complicated for convex- 
ity and sos-convexity and that is the reason why we are presenting Theorems 3.8 
and 3.9 as separate theorems. The difficulty arises from the fact that unlike 
nonnegativity and sum of squares, convexity and sos-convexity are not always 
preserved under homogenization. (Or equivalently, the properties of being not 
convex and not sos-convex are not preserved under dehomogenization.) In fact, 
any convex polynomial that is not psd will no longer be convex after homogeniza- 
tion. This is because convex forms are psd but the homogenization of a non-psd 
polynomial is a non-psd form. Even if a convex polynomial is psd, its homogenization
may not be convex. For example the univariate polynomial $10x_1^4 - 5x_1 + 2$
is convex and psd, but its homogenization $10x_1^4 - 5x_1x_2^3 + 2x_2^4$ is not convex.$^7$ To
observe the same phenomenon for sos-convexity, consider the trivariate form $p$
in (3.11) which is convex but not sos-convex and define $\tilde{p}(x_2,x_3) = p(1,x_2,x_3)$.
Then, one can check that $\tilde{p}$ is sos-convex (i.e., its $2 \times 2$ Hessian factors) even
though its homogenization which is $p$ is not sos-convex [11].
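The univariate example in this remark is easy to verify by hand or by machine. The Python sketch below reads the example as $p(x_1) = 10x_1^4 - 5x_1 + 2$ with homogenization $10x_1^4 - 5x_1x_2^3 + 2x_2^4$ (an assumption about the intended polynomial), and evaluates the hand-computed Hessian of the homogenization at the point $(1, 1)$, where it fails to be positive semidefinite. Function names are illustrative.

```python
from fractions import Fraction

def p(x1):
    """The univariate polynomial 10 x1^4 - 5 x1 + 2; its second derivative is 120 x1^2 >= 0."""
    return 10 * x1**4 - 5 * x1 + 2

def hessian_homogenized(x1, x2):
    """Hessian of p_h(x1, x2) = 10 x1^4 - 5 x1 x2^3 + 2 x2^4, computed by hand."""
    return ((120 * x1**2, -15 * x2**2),
            (-15 * x2**2, -30 * x1 * x2 + 24 * x2**2))

# p' = 40 x1^3 - 5 vanishes at x1 = 1/2, so this is the global minimum of the convex p.
p_min = p(Fraction(1, 2))

H = hessian_homogenized(1, 1)
det_H = H[0][0] * H[1][1] - H[0][1] * H[1][0]
print(p_min, H, det_H)
```

The negative diagonal entry of the Hessian at $(1, 1)$ already rules out convexity of the homogenized form, while the positive minimum value confirms that $p$ itself is psd.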

$^7$ What is true however is that a nonnegative form of degree $d$ is convex if and only if the
$d$-th root of its dehomogenization is a convex function [140, Prop. 4.4].

Remark 3.5.2. Resemblance to the result of Hilbert. The reader may have
noticed from the statements of Theorem 3.1 and Theorems 3.8 and 3.9 that the
cases where convex polynomials (forms) are sos-convex are exactly the same cases
where nonnegative polynomials are sums of squares! We shall emphasize that
as far as we can tell, our results do not follow (except in the simplest cases)
from Hilbert's result stated in Theorem 3.1. Note that the question of convexity
or sos-convexity of a polynomial $p(x)$ in $n$ variables and degree $d$ is about the
polynomials $g_{1/2}(x,y)$, $g_\nabla(x,y)$, or $g_{\nabla^2}(x,y)$ defined in (3.6) being psd or sos. Even
though these polynomials still have degree $d$, it is important to keep in mind that
they are polynomials in $2n$ variables. Therefore, there is no direct correspondence
with the characterization of Hilbert. To make this more explicit, let us consider
for example one particular claim of Theorem 3.9: $\Sigma C_{2,4} = C_{2,4}$. For a form $p$
in 2 variables and degree 4, the polynomials $g_{1/2}$, $g_\nabla$, and $g_{\nabla^2}$ will be forms in 4
variables and degree 4. We know from Hilbert's result that in this situation psd
but not sos forms do in fact exist. However, for the forms in 4 variables and
degree 4 that have the special structure of $g_{1/2}$, $g_\nabla$, or $g_{\nabla^2}$, psd turns out to be
equivalent to sos.

The proofs of Theorems 3.8 and 3.9 are broken into the next two subsections. 
In Subsection 3.5.1, we provide the proofs for the cases where convexity and sos- 
convexity are equivalent. Then in Subsection 3.5.2, we prove that in all other 
cases there exist convex polynomials that are not sos-convex. 

■ 3.5.1 Proofs of Theorems 3.8 and 3.9: cases where $\widetilde{\Sigma C}_{n,d} = \tilde{C}_{n,d}$, $\Sigma C_{n,d} = C_{n,d}$

When proving equivalence of convexity and sos-convexity, it turns out to be more
convenient to work with the second order characterization of sos-convexity, i.e.,
with the form $g_{\nabla^2}(x,y) = y^T H(x) y$ in (3.6). The reason for this is that this
form is always quadratic in $y$, and this allows us to make use of the following key
theorem, henceforth referred to as the "biform theorem".

Theorem 3.10 (e.g. [47]). Let $f := f(u_1, u_2, v_1, \ldots, v_m)$ be a form in the variables
$u := (u_1, u_2)^T$ and $v := (v_1, \ldots, v_m)^T$ that is a quadratic form in $v$ for fixed $u$
and a form (of however large degree) in $u$ for fixed $v$. Then $f$ is psd if and only
if it is sos.$^8$

The biform theorem has been proven independently by several authors. See [47]
and [20] for more background on this theorem and in particular [47, Sec. 7] for
an elegant proof and some refinements. We now proceed with our proofs which
will follow in a rather straightforward manner from the biform theorem.

$^8$ Note that the results $\Sigma_{2,d} = P_{2,d}$ and $\Sigma_{n,2} = P_{n,2}$ are both special cases of this theorem.

Theorem 3.11. $\widetilde{\Sigma C}_{1,d} = \tilde{C}_{1,d}$ for all $d$. $\Sigma C_{2,d} = C_{2,d}$ for all $d$.

Proof. For a univariate polynomial, convexity means that the second derivative,
which is another univariate polynomial, is psd. Since $\tilde{\Sigma}_{1,d} = \tilde{P}_{1,d}$, the second
derivative must be sos. Therefore, $\widetilde{\Sigma C}_{1,d} = \tilde{C}_{1,d}$. To prove $\Sigma C_{2,d} = C_{2,d}$, suppose
we have a convex bivariate form $p$ of degree $d$ in variables $x := (x_1,x_2)^T$. The
Hessian $H := H(x)$ of $p$ is a $2 \times 2$ matrix whose entries are forms of degree
$d - 2$. If we let $y := (y_1,y_2)^T$, convexity of $p$ implies that the form $y^T H(x) y$ is
psd. Since $y^T H(x) y$ meets the requirements of the biform theorem above with
$(u_1,u_2) = (x_1,x_2)$ and $(v_1,v_2) = (y_1,y_2)$, it follows that $y^T H(x) y$ is sos. Hence, $p$
is sos-convex. □

Theorem 3.12. $\widetilde{\Sigma C}_{n,2} = \tilde{C}_{n,2}$ for all $n$. $\Sigma C_{n,2} = C_{n,2}$ for all $n$.

Proof. Let $x := (x_1, \ldots, x_n)^T$ and $y := (y_1, \ldots, y_n)^T$. Let $p(x) = \tfrac{1}{2}x^T Q x + b^T x + c$
be a quadratic polynomial. The Hessian of $p$ in this case is the constant symmetric
matrix $Q$. Convexity of $p$ implies that $y^T Q y$ is psd. But since $\Sigma_{n,2} = P_{n,2}$, $y^T Q y$
must be sos. Hence, $p$ is sos-convex. The proof of $\Sigma C_{n,2} = C_{n,2}$ is identical. □

Theorem 3.13. $\widetilde{\Sigma C}_{2,4} = \tilde{C}_{2,4}$.

Proof. Let $p(x) := p(x_1,x_2)$ be a convex bivariate quartic polynomial. Let
$H := H(x)$ denote the Hessian of $p$ and let $y := (y_1,y_2)^T$. Note that $H(x)$ is
a $2 \times 2$ matrix whose entries are (not necessarily homogeneous) quadratic polynomials.
Since $p$ is convex, $y^T H(x) y$ is psd. Let $\tilde{H}(x_1,x_2,x_3)$ be a $2 \times 2$ matrix
whose entries are obtained by homogenizing the entries of $H$. It is easy to see
that $y^T \tilde{H}(x_1,x_2,x_3) y$ is then the form obtained by homogenizing $y^T H(x) y$ and
is therefore psd. Now we can employ the biform theorem (Theorem 3.10) with
$(u_1,u_2) = (y_1,y_2)$ and $(v_1,v_2,v_3) = (x_1,x_2,x_3)$ to conclude that $y^T \tilde{H}(x_1,x_2,x_3) y$
is sos. But upon dehomogenizing by setting $x_3 = 1$, we conclude that $y^T H(x) y$ is
sos. Hence, $p$ is sos-convex. □

Theorem 3.14 (Ahmadi, Blekherman, Parrilo [2]). $\Sigma C_{3,4} = C_{3,4}$.

Unlike Hilbert's results $\tilde{\Sigma}_{2,4} = \tilde{P}_{2,4}$ and $\Sigma_{3,4} = P_{3,4}$ which are equivalent
statements and essentially have identical proofs, the proof of $\Sigma C_{3,4} = C_{3,4}$ is
considerably more involved than the proof of $\widetilde{\Sigma C}_{2,4} = \tilde{C}_{2,4}$. Here, we briefly point
out why this is the case and refer the reader to [2] for more details.

If $p(x) := p(x_1,x_2,x_3)$ is a ternary quartic form, its Hessian $H(x)$ is a $3 \times 3$
matrix whose entries are quadratic forms. In this case, we can no longer apply
the biform theorem to the form $y^T H(x) y$. In fact, the matrix

$$C(x) = \begin{pmatrix}
x_1^2 + 2x_2^2 & -x_1x_2 & -x_1x_3 \\
-x_1x_2 & x_2^2 + 2x_3^2 & -x_2x_3 \\
-x_1x_3 & -x_2x_3 & x_3^2 + 2x_1^2
\end{pmatrix} \tag{3.13}$$

due to Choi [45] serves as an explicit example of a $3 \times 3$ matrix with quadratic form
entries that is positive semidefinite but not an sos-matrix; i.e., the biquadratic
form $y^T C(x) y$ is psd but not sos. However, the matrix $C(x)$ above is not a valid
Hessian, i.e., it cannot be the matrix of the second derivatives of any polynomial.
If this were the case, the third partial derivatives would commute. On the other
hand, we have in particular

$$\frac{\partial C_{1,1}(x)}{\partial x_3} = 0 \neq -x_3 = \frac{\partial C_{1,3}(x)}{\partial x_1}.$$

A biquadratic Hessian form is a biquadratic form $y^T H(x) y$ where $H(x)$ is the
Hessian of some quartic form. Biquadratic Hessian forms satisfy a special symmetry
property. Let us call a biquadratic form $b(x; y)$ symmetric if it satisfies
the symmetry relation $b(y; x) = b(x; y)$. It is an easy exercise to show that
biquadratic Hessian forms satisfy $y^T H(x) y = x^T H(y) x$ and are therefore symmetric
biquadratic forms. This symmetry property is a rather strong condition that is
not satisfied e.g. by the Choi biquadratic form $y^T C(x) y$ in (3.13).

A simple dimension counting argument shows that the vector spaces of biquadratic
forms, symmetric biquadratic forms, and biquadratic Hessian forms in
variables $(x_1, x_2, x_3; y_1, y_2, y_3)$ respectively have dimensions 36, 21, and 15. Since
the symmetry requirement drops the dimension of the space of biquadratic forms
significantly, and since sos polynomials are known to generally cover much larger
volume in the set of psd polynomials in presence of symmetries (see e.g. [28]),
one may initially suspect (as we did) that the equivalence between psd and sos
ternary Hessian biquadratic forms is a consequence of the symmetry property.
Our next theorem shows that interestingly enough this is not the case.
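The structural claims about the Choi matrix and about biquadratic Hessian forms can be probed numerically. The Python sketch below (i) tests the symmetry relation $b(x; y) = b(y; x)$ on a grid for the Hessian of one quartic form and for the Choi form, (ii) compares two partial derivatives of entries of $C(x)$ by exact finite differences, and (iii) recomputes the dimensions 36, 21, and 15. The entries of $C(x)$ are taken as read from (3.13); the quartic form, the grid, and all function names are assumptions made for illustration.

```python
from itertools import product
from math import comb

def choi(x):
    """Choi's matrix C(x), with entries as read from (3.13)."""
    x1, x2, x3 = x
    return ((x1**2 + 2 * x2**2, -x1 * x2, -x1 * x3),
            (-x1 * x2, x2**2 + 2 * x3**2, -x2 * x3),
            (-x1 * x3, -x2 * x3, x3**2 + 2 * x1**2))

def quartic_hessian(x):
    """Hand-computed Hessian of the illustrative quartic form x1^4 + x1^2 x2^2 + x2 x3^3."""
    x1, x2, x3 = x
    return ((12 * x1**2 + 2 * x2**2, 4 * x1 * x2, 0),
            (4 * x1 * x2, 2 * x1**2, 3 * x3**2),
            (0, 3 * x3**2, 6 * x2 * x3))

def biform(matrix_fn, x, y):
    """Evaluate y^T A(x) y for a 3 x 3 matrix-valued function A."""
    m = matrix_fn(x)
    return sum(y[i] * m[i][j] * y[j] for i in range(3) for j in range(3))

def is_symmetric_on(matrix_fn, points):
    """Test b(x; y) == b(y; x) for all pairs of sample points."""
    return all(biform(matrix_fn, x, y) == biform(matrix_fn, y, x)
               for x in points for y in points)

points = list(product((-1, 0, 1, 2), repeat=3))
print(is_symmetric_on(quartic_hessian, points), is_symmetric_on(choi, points))

# Finite differences are exact here: C_11 does not depend on x3 and C_13 is linear in x1.
d_c11_dx3 = choi((2, 1, 7))[0][0] - choi((2, 1, 6))[0][0]
d_c13_dx1 = choi((3, 1, 5))[0][2] - choi((2, 1, 5))[0][2]
print(d_c11_dx3, d_c13_dx1)

# Dimensions: all biquadratic forms, symmetric ones, and Hessians of ternary quartic forms.
dims = (comb(4, 2) ** 2, comb(4, 2) * (comb(4, 2) + 1) // 2, comb(6, 2))
print(dims)
```

If $C(x)$ were the Hessian of some polynomial, the two finite differences would have to agree, since both would equal the same third partial derivative.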



Theorem 3.15. There exist symmetric biquadratic forms in two sets of three
variables that are positive semidefinite but not a sum of squares.

Proof. We claim that the following biquadratic form has the required properties:

$$\begin{aligned}
b(x_1,x_2,x_3; y_1,y_2,y_3) = {} & 4y_1y_2x_1^2 + 4x_1x_2y_1^2 + 9y_1y_3x_1^2 + 9x_1x_3y_1^2 - 10y_2y_3x_1^2 \\
& - 10x_2x_3y_1^2 + 12y_1^2x_1^2 + 12y_2^2x_1^2 + 12x_2^2y_1^2 + 6y_3^2x_1^2 \\
& + 6x_3^2y_1^2 + 23x_2^2y_1y_2 + 23y_2^2x_1x_2 + 13x_2^2y_1y_3 + 13x_1x_3y_2^2 \\
& + 13y_2y_3x_2^2 + 13x_2x_3y_2^2 + 12x_2^2y_2^2 + 12x_2^2y_3^2 + 12y_2^2x_3^2 \\
& + 5x_3^2y_1y_2 + 5y_3^2x_1x_2 + 12x_3^2y_3^2 + 3x_3^2y_1y_3 + 3y_3^2x_1x_3 \\
& + 7x_3^2y_2y_3 + 7y_3^2x_2x_3 + 31y_1y_2x_1x_2 - 10x_1x_3y_1y_3 \\
& - 11x_1x_3y_2y_3 - 11y_1y_3x_2x_3 + 5x_1x_2y_2y_3 + 5y_1y_2x_2x_3 \\
& + 3x_1x_3y_1y_2 + 3y_1y_3x_1x_2 - 5x_2x_3y_2y_3.
\end{aligned} \tag{3.14}$$

The fact that $b(x; y) = b(y; x)$ can readily be seen from the order in which we
have written the monomials. The proof that $b(x; y)$ is psd but not sos is given
in [2] and omitted from here. □

In view of the above theorem, it is rather remarkable that all positive semidefinite
biquadratic Hessian forms in $(x_1,x_2,x_3; y_1,y_2,y_3)$ turn out to be sums of
squares, i.e., that $\Sigma C_{3,4} = C_{3,4}$.

■ 3.5.2 Proofs of Theorems 3.8 and 3.9: cases where $\widetilde{\Sigma C}_{n,d} \subset \tilde{C}_{n,d}$, $\Sigma C_{n,d} \subset C_{n,d}$

The goal of this subsection is to establish that the cases presented in the previ- 
ous subsection are the only cases where convexity and sos-convexity are equiv- 
alent. We will first give explicit examples of convex but not sos-convex polyno- 
mials/forms that are "minimal" jointly in the degree and dimension and then 
present an argument for all dimensions and degrees higher than those of the min- 
imal cases. 

Minimal convex but not sos-convex polynomials/forms 

The minimal examples of convex but not sos-convex polynomials (resp. forms)
turn out to belong to $\tilde{C}_{2,6} \setminus \widetilde{\Sigma C}_{2,6}$ and $\tilde{C}_{3,4} \setminus \widetilde{\Sigma C}_{3,4}$ (resp. $C_{3,6} \setminus \Sigma C_{3,6}$ and
$C_{4,4} \setminus \Sigma C_{4,4}$). Recall from Remark 3.5.1 that we lack a general argument for going from
convex but not sos-convex forms to polynomials or vice versa. Because of this,
one would need to present four different polynomials in the sets mentioned above
and prove that each polynomial is (i) convex and (ii) not sos-convex. This is a
total of eight arguments to make which is quite cumbersome. However, as we
will see in the proofs of Theorems 3.16 and 3.17 below, we have been able to find
examples that act "nicely" with respect to particular ways of dehomogenization.
This will allow us to reduce the total number of claims we have to prove from
eight to four.

The polynomials that we are about to present next have been found with the
assistance of a computer and by employing some "tricks" with semidefinite programming
similar to those presented in Appendix A.$^9$ In this process, we have
made use of software packages YALMIP [98], SOSTOOLS [132], and the SDP
solver SeDuMi [157], which we acknowledge here. To make the chapter relatively
self-contained and to emphasize the fact that using rational sum of squares certificates
one can make such computer assisted proofs fully formal, we present the
proof of Theorem 3.16 below in Appendix B. On the other hand, the proof
of Theorem 3.17, which is very similar in style to the proof of Theorem 3.16, is
largely omitted to save space. All of the proofs are available in electronic form
and in their entirety at http://aaa.lids.mit.edu/software.

Theorem 3.16. SC 2i6 is a proper subset of C 2 fi- SC 3)6 is a proper subset o/C 3j6 . 

Proof. We claim that the form 

f(xi,x 2 , x 3 ) = llx\ — 155xfx 2 + 445xfxl + 76x^X2 + 556x^2 + 68x1X2 

+240x1 - 9xfx 3 - 1 129x^X3 + 62xfx^x 3 + 1206xix|x 3 

-343x^X3 + 3Q3xfxl + 773x?x 2 x 3 ; + 891x?x|x^ - 869x^x1 

+ 1043x^x2 - \Ax\x\ - 1108x?x 2 x| - 216xixlx^ - 839x|x| 

+721xfx| + 436x1X2X3* + 378x|x^ + 48x x x^ - 97x 2 x| + 89x 6 3 

(3.15) 

belongs to C^q \ SC^, and the polynomial 10 

/(xi,x 2 ) = /(xi,x 2 ,l - ^x 2 ) (3.16) 

belongs to $\tilde{C}_{2,6} \setminus \tilde{\Sigma C}_{2,6}$. Note that since convexity and sos-convexity are both
preserved under restrictions to affine subspaces (recall Remark 3.3.2), it suffices
to show that the form $f$ in (3.15) is convex and the polynomial $\tilde{f}$ in (3.16) is not
sos-convex. Let $x := (x_1,x_2,x_3)^T$, $y := (y_1,y_2,y_3)^T$, $\tilde{x} := (x_1,x_2)^T$, $\tilde{y} := (y_1,y_2)^T$,
and denote the Hessians of $f$ and $\tilde{f}$ by $H_f$ and $H_{\tilde{f}}$, respectively. In Appendix B,
we provide rational Gram matrices which prove that the form

$$(x_1^2 + x_2^2)\cdot y^T H_f(x)\,y \tag{3.17}$$

is sos. This, together with nonnegativity of $x_1^2 + x_2^2$ and continuity of $y^T H_f(x)\,y$,
implies that $y^T H_f(x)\,y$ is psd. Therefore, $f$ is convex. The proof that $\tilde{f}$ is not
sos-convex proceeds by showing that $H_{\tilde{f}}$ is not an sos-matrix via a separation
argument. In Appendix B, we present a separating hyperplane that leaves the
appropriate sos cone on one side and the polynomial

$$\tilde{y}^T H_{\tilde{f}}(\tilde{x})\,\tilde{y} \tag{3.18}$$

on the other. □

9 The approach of Appendix A, however, does not lead to examples that are minimal. But
the idea is similar.

10 The polynomial $f(x_1, x_2, 1)$ turns out to be sos-convex, and therefore does not do the
job. One can of course change coordinates, and then in the new coordinates perform the
dehomogenization by setting $x_3 = 1$.
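The convexity claim for the form in (3.15) can be sanity-checked numerically. The sketch below is not the rational certificate of Appendix B: it only samples the Hessian at random points and tests positive definiteness through leading principal minors, which is evidence rather than proof. The coefficient table is transcribed from (3.15); the helper names (`diff`, `evaluate`, `sample_hessians_positive_definite`) and the sampling box are our own choices.

```python
import random

# Coefficients of f in (3.15), keyed by the exponents (a, b, c) of x1^a x2^b x3^c.
F = {
    (6, 0, 0): 77, (5, 1, 0): -155, (4, 2, 0): 445, (3, 3, 0): 76,
    (2, 4, 0): 556, (1, 5, 0): 68, (0, 6, 0): 240, (5, 0, 1): -9,
    (3, 2, 1): -1129, (2, 3, 1): 62, (1, 4, 1): 1206, (0, 5, 1): -343,
    (4, 0, 2): 363, (3, 1, 2): 773, (2, 2, 2): 891, (1, 3, 2): -869,
    (0, 4, 2): 1043, (3, 0, 3): -14, (2, 1, 3): -1108, (1, 2, 3): -216,
    (0, 3, 3): -839, (2, 0, 4): 721, (1, 1, 4): 436, (0, 2, 4): 378,
    (1, 0, 5): 48, (0, 1, 5): -97, (0, 0, 6): 89,
}


def diff(poly, i):
    """Partial derivative of a polynomial (dict: exponents -> coeff) in variable i."""
    out = {}
    for exps, coeff in poly.items():
        if exps[i] > 0:
            lowered = exps[:i] + (exps[i] - 1,) + exps[i + 1:]
            out[lowered] = out.get(lowered, 0) + coeff * exps[i]
    return out


def evaluate(poly, x):
    """Evaluate a polynomial in three variables at the point x."""
    return sum(c * x[0] ** e[0] * x[1] ** e[1] * x[2] ** e[2] for e, c in poly.items())


# Symbolic Hessian of f: a 3 x 3 table of quartic forms.
HESSIAN = [[diff(diff(F, i), j) for j in range(3)] for i in range(3)]


def hessian_at(x):
    return [[evaluate(HESSIAN[i][j], x) for j in range(3)] for i in range(3)]


def leading_minors(h):
    m1 = h[0][0]
    m2 = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    m3 = (h[0][0] * (h[1][1] * h[2][2] - h[1][2] * h[2][1])
          - h[0][1] * (h[1][0] * h[2][2] - h[1][2] * h[2][0])
          + h[0][2] * (h[1][0] * h[2][1] - h[1][1] * h[2][0]))
    return m1, m2, m3


def sample_hessians_positive_definite(trials=2000, seed=0):
    """Return False as soon as a sampled Hessian fails to be positive definite."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        if min(leading_minors(hessian_at(x))) <= 0:
            return False
    return True


if __name__ == "__main__":
    print(sample_hessians_positive_definite())
```

Positive definiteness at the sampled points is stronger than the positive semidefiniteness that convexity requires, so a `False` result would call for a closer look at that point rather than refute convexity outright.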

Theorem 3.17. $\tilde{\Sigma C}_{3,4}$ is a proper subset of $\tilde{C}_{3,4}$. $\Sigma C_{4,4}$ is a proper subset of $C_{4,4}$.

Proof. We claim that the form 

h(x 1: . . . , x 4 ) = 1671x? - 4134x?x 2 - 3332x?x 3 + 5104x?x^ + 4989x?x 2 x 3 

+3490x?x| - 2203xixl - SOSOxxx^x-j - m§x x x 2 x\ 

-1522a;ia;| + 1227^ - 595xfx 3 + \^x\x\ + \\A&x 2 x\ 

+ 1195728a;| - 1932xia~j - 2296x 2 xi - 3144x 3 xl + \^hx\x\ 

-1376x?x 4 - 263x!X 2 xl + 2790x?x 2 x 4 + 1Yl\x\x\ + 979x^ 

-292xix^4 - 1224x|x4 + 2404xix 3 xl + 2727x 2 x 3 x^ 

-2852x!x|x 4 - 388x 2 x|x 4 - 1520x^x 4 + 2943x?x 3 x 4 

— 5053xiX 2 x 3 x 4 + 2552x|x 3 x 4 + 3512x3xf 

(3.19) 

belongs to $C_{4,4} \setminus \Sigma C_{4,4}$, and the polynomial

$$\tilde{h}(x_1,x_2,x_3) = h(x_1,x_2,x_3,1) \tag{3.20}$$

belongs to $\tilde{C}_{3,4} \setminus \tilde{\Sigma C}_{3,4}$. Once again, it suffices to prove that $h$ is convex and $\tilde{h}$
is not sos-convex. Let $x := (x_1,x_2,x_3,x_4)^T$, $y := (y_1,y_2,y_3,y_4)^T$, and denote the
Hessians of $h$ and $\tilde{h}$ by $H_h$ and $H_{\tilde{h}}$, respectively. The proof that $h$ is convex is done
by showing that the form

$$(x_2^2 + x_3^2 + x_4^2)\cdot y^T H_h(x)\,y \tag{3.21}$$

is sos.$^{11}$ The proof that $\tilde{h}$ is not sos-convex is done again by means of a separating
hyperplane. □



Convex but not sos-convex polynomials/forms in all higher degrees and dimensions 

Given a convex but not sos-convex polynomial (form) in $n$ variables, it is very
easy to argue that such a polynomial (form) must also exist in a larger number
of variables. If $p(x_1,\ldots,x_n)$ is a form in $C_{n,d} \setminus \Sigma C_{n,d}$, then

$$\bar{p}(x_1,\ldots,x_{n+1}) = p(x_1,\ldots,x_n) + x_{n+1}^d$$

belongs to $C_{n+1,d} \setminus \Sigma C_{n+1,d}$. Convexity of $\bar{p}$ is obvious since it is a sum of convex
functions. The fact that $\bar{p}$ is not sos-convex can also easily be seen from the block
diagonal structure of the Hessian of $\bar{p}$: if the Hessian of $\bar{p}$ were to factor, it would
imply that the Hessian of $p$ should also factor. The argument for going from
$\tilde{C}_{n,d} \setminus \tilde{\Sigma C}_{n,d}$ to $\tilde{C}_{n+1,d} \setminus \tilde{\Sigma C}_{n+1,d}$ is identical.
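The block diagonal structure used in this argument is easy to see on a small case. The sketch below lifts a quartic form in two variables to three variables by adding $x_3^4$ and computes both Hessians symbolically. The form `p` is a hypothetical placeholder chosen only to show the structure (it is not one of the examples of this chapter), and the helper names are ours.

```python
def diff(poly, i):
    """Partial derivative of a polynomial (dict: exponents -> coeff) in variable i."""
    out = {}
    for exps, coeff in poly.items():
        if exps[i] > 0:
            lowered = exps[:i] + (exps[i] - 1,) + exps[i + 1:]
            out[lowered] = out.get(lowered, 0) + coeff * exps[i]
    return out


def hessian(poly, nvars):
    """Symbolic Hessian as an nvars x nvars table of polynomials."""
    return [[diff(diff(poly, i), j) for j in range(nvars)] for i in range(nvars)]


def lift_with_pure_power(p, n, d):
    """Return p(x_1, ..., x_n) + x_{n+1}^d as a polynomial in n + 1 variables."""
    lifted = {exps + (0,): coeff for exps, coeff in p.items()}
    lifted[(0,) * n + (d,)] = 1
    return lifted


# Hypothetical quartic form in two variables, used only to display the structure.
p = {(4, 0): 1, (2, 2): 1, (0, 4): 1}
p_bar = lift_with_pure_power(p, n=2, d=4)

H_p = hessian(p, 2)
H_bar = hessian(p_bar, 3)

if __name__ == "__main__":
    for row in H_bar:
        print(row)
```

The last row and column of `H_bar` vanish except for the corner entry $d(d-1)x_{n+1}^{d-2}$, and the upper-left block is the Hessian of `p`, so any factorization of the lifted Hessian restricts to one of the original Hessian.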

Unfortunately, an argument for increasing the degree of convex but not
sos-convex forms seems to be significantly more difficult to obtain. In fact, we have
been unable to come up with a natural operation that would produce a form in
$C_{n,d+2} \setminus \Sigma C_{n,d+2}$ from a form in $C_{n,d} \setminus \Sigma C_{n,d}$. We will instead take a different route:
we are going to present a general procedure for going from a form in $P_{n,d} \setminus \Sigma_{n,d}$ to
a form in $C_{n,d+2} \setminus \Sigma C_{n,d+2}$. This will serve our purpose of constructing convex but
not sos-convex forms in higher degrees and is perhaps also of independent interest.
For instance, it can be used to construct convex but not sos-convex forms
that inherit structural properties (e.g. symmetry) of the known examples of psd
but not sos forms. The procedure is constructive modulo the value of two positive
constants ($\gamma$ and $\alpha$ below) whose existence will be shown nonconstructively.

Although the proof of the general case is no different, we present this
construction for the case $n = 3$. The reason is that it suffices for us to construct
forms in $C_{3,d} \setminus \Sigma C_{3,d}$ for $d$ even and $\geq 8$. These forms, together with the two forms
in $C_{3,6} \setminus \Sigma C_{3,6}$ and $C_{4,4} \setminus \Sigma C_{4,4}$ presented in (3.15) and (3.19), and with the simple
procedure for increasing the number of variables, cover all the values of $n$ and $d$
for which convex but not sos-convex forms exist.

11 The choice of multipliers in (3.17) and (3.21) is motivated by a result of Reznick in [137]
explained in Appendix A.

For the remainder of this section, let $x := (x_1,x_2,x_3)^T$ and $y := (y_1,y_2,y_3)^T$.

Theorem 3.18. Let $m := m(x)$ be a ternary form of degree $d$ (with $d$ necessarily
even and $\geq 6$) satisfying the following three requirements:

R1: $m$ is positive definite.

R2: $m$ is not a sum of squares.

R3: The Hessian $H_m$ of $m$ is positive definite at the point $(1,0,0)^T$.

Let $g := g(x_2, x_3)$ be any bivariate form of degree $d + 2$ whose Hessian is positive
definite.

Then, there exists a constant $\gamma > 0$, such that the form $f$ of degree $d + 2$ given by

$$f(x) = \int_0^{x_1}\!\!\int_0^{s} m(t,x_2,x_3)\,dt\,ds + \gamma\, g(x_2,x_3) \tag{3.22}$$

is convex but not sos-convex.

Before we prove this theorem, let us comment on how one can get examples
of forms $m$ and $g$ that satisfy the requirements of the theorem. The choice of $g$
is in fact very easy. We can e.g. take

$$g(x_2,x_3) = (x_2^2 + x_3^2)^{\frac{d+2}{2}},$$

which has a positive definite Hessian. As for the choice of $m$, essentially any psd
but not sos ternary form can be turned into a form that satisfies requirements
R1, R2, and R3. Indeed if the Hessian of such a form is positive definite at just
one point, then that point can be taken to $(1,0,0)^T$ by a change of coordinates
without changing the properties of being psd and not sos. If the form is not
positive definite, then it can be made so by adding a small enough multiple of a
positive definite form to it. For concreteness, we construct in the next lemma a
family of forms that together with the above theorem will give us convex but not
sos-convex ternary forms of any degree $\geq 8$.

Lemma 3.19. For any even degree $d \geq 6$, there exists a constant $\alpha > 0$, such
that the form

$$m(x) = x_1^{d-6}\,(x_1^2x_2^4 + x_1^4x_2^2 - 3x_1^2x_2^2x_3^2 + x_3^6) + \alpha\,(x_1^2 + x_2^2 + x_3^2)^{\frac{d}{2}} \tag{3.23}$$
satisfies the requirements R1, R2, and R3 of Theorem 3.18.

Proof. The form

$$x_1^2x_2^4 + x_1^4x_2^2 - 3x_1^2x_2^2x_3^2 + x_3^6$$

is the familiar Motzkin form in (3.1) that is psd but not sos [107]. For any even
degree $d \geq 6$, the form

$$x_1^{d-6}\,(x_1^2x_2^4 + x_1^4x_2^2 - 3x_1^2x_2^2x_3^2 + x_3^6)$$

is a form of degree $d$ that is clearly still psd and less obviously still not sos;
see [138]. This together with the fact that $\Sigma_{n,d}$ is a closed cone implies existence
of a small positive value of $\alpha$ for which the form $m$ in (3.23) is positive definite
but not a sum of squares, hence satisfying requirements R1 and R2.

Our next claim is that for any positive value of $\alpha$, the Hessian $H_m$ of the form
$m$ in (3.23) satisfies

$$H_m(1,0,0) = \begin{bmatrix} c_1 & 0 & 0 \\ 0 & c_2 & 0 \\ 0 & 0 & c_3 \end{bmatrix} \tag{3.24}$$

for some positive constants $c_1, c_2, c_3$, therefore also passing requirement R3. To
see the above equality, first note that since $m$ is a form of degree $d$, its Hessian $H_m$
will have entries that are forms of degree $d - 2$. Therefore, the only monomials
that can survive in this Hessian after setting $x_2$ and $x_3$ to zero are multiples of
$x_1^{d-2}$. It is easy to see that an $x_1^{d-2}$ monomial in an off-diagonal entry of $H_m$
would lead to a monomial in $m$ that is not even. On the other hand, the form $m$
in (3.23) only has even monomials. This explains why the off-diagonal entries of
the right hand side of (3.24) are zero. Finally, we note that for any positive value
of $\alpha$, the form $m$ in (3.23) includes positive multiples of $x_1^d$, $x_1^{d-2}x_2^2$, and $x_1^{d-2}x_3^2$,
which lead to positive multiples of $x_1^{d-2}$ on the diagonal of $H_m$. Hence, $c_1$, $c_2$, and
$c_3$ are positive. □
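The diagonal structure in (3.24) can be confirmed with exact arithmetic for specific degrees. The sketch below builds the form of (3.23) for a given even $d$ and a chosen $\alpha$, then evaluates its Hessian at $(1,0,0)$. The value $\alpha = 1/10$ is an arbitrary illustrative choice, not a value for which requirement R2 has been verified, and the helper names are ours. Tracking the three monomials named in the proof gives $c_1 = \alpha d(d-1)$, $c_2 = 2 + \alpha d$, and $c_3 = \alpha d$.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Product of two polynomials stored as dicts: exponents -> coeff."""
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            e = tuple(a + b for a, b in zip(e1, e2))
            out[e] = out.get(e, 0) + c1 * c2
    return out


def poly_pow(p, k):
    result = {(0, 0, 0): 1}
    for _ in range(k):
        result = poly_mul(result, p)
    return result


def diff(poly, i):
    out = {}
    for exps, coeff in poly.items():
        if exps[i] > 0:
            lowered = exps[:i] + (exps[i] - 1,) + exps[i + 1:]
            out[lowered] = out.get(lowered, 0) + coeff * exps[i]
    return out


def evaluate(poly, x):
    return sum(c * x[0] ** e[0] * x[1] ** e[1] * x[2] ** e[2] for e, c in poly.items())


def form_m(d, alpha):
    """The form in (3.23): x1^(d-6) * Motzkin + alpha * (x1^2 + x2^2 + x3^2)^(d/2)."""
    motzkin = {(2, 4, 0): 1, (4, 2, 0): 1, (2, 2, 2): -3, (0, 0, 6): 1}
    m = {(a + d - 6, b, c): coeff for (a, b, c), coeff in motzkin.items()}
    squared_norm = {(2, 0, 0): 1, (0, 2, 0): 1, (0, 0, 2): 1}
    for e, coeff in poly_pow(squared_norm, d // 2).items():
        m[e] = m.get(e, 0) + alpha * coeff
    return m


def hessian_at(poly, x):
    return [[evaluate(diff(diff(poly, i), j), x) for j in range(3)] for i in range(3)]


if __name__ == "__main__":
    alpha = Fraction(1, 10)  # illustrative value only
    for d in (6, 8, 10):
        print(d, hessian_at(form_m(d, alpha), (1, 0, 0)))
```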

Next, we state a lemma that will be employed in the proof of Theorem 3.18.

Lemma 3.20. Let $m$ be a trivariate form satisfying the requirements R1 and R3
of Theorem 3.18. Let $H_{\hat{m}}$ denote the Hessian of the form $\int_0^{x_1}\!\int_0^{s} m(t,x_2,x_3)\,dt\,ds$.
Then, there exists a positive constant $\delta$, such that

$$y^T H_{\hat{m}}(x)\,y > 0$$

on the set

$$S := \{(x,y) \mid \|x\| = 1,\ \|y\| = 1,\ (x_2^2 + x_3^2 < \delta \ \text{or}\ y_2^2 + y_3^2 < \delta)\}. \tag{3.25}$$

Proof. We observe that when $y_2^2 + y_3^2 = 0$, we have

$$y^T H_{\hat{m}}(x)\,y = y_1^2\, m(x),$$

which by requirement R1 is positive when $\|x\| = \|y\| = 1$. By continuity of the
form $y^T H_{\hat{m}}(x)\,y$, we conclude that there exists a small positive constant $\delta_y$ such
that $y^T H_{\hat{m}}(x)\,y > 0$ on the set

$$S_y := \{(x,y) \mid \|x\| = 1,\ \|y\| = 1,\ y_2^2 + y_3^2 < \delta_y\}.$$

Next, we leave it to the reader to check that

$$H_{\hat{m}}(1,0,0) = \frac{1}{d(d-1)}\, H_m(1,0,0).$$

Therefore, when $x_2^2 + x_3^2 = 0$, requirement R3 implies that $y^T H_{\hat{m}}(x)\,y$ is positive
when $\|x\| = \|y\| = 1$. Appealing to continuity again, we conclude that there
exists a small positive constant $\delta_x$ such that $y^T H_{\hat{m}}(x)\,y > 0$ on the set

$$S_x := \{(x,y) \mid \|x\| = 1,\ \|y\| = 1,\ x_2^2 + x_3^2 < \delta_x\}.$$

If we now take $\delta = \min\{\delta_y, \delta_x\}$, the lemma is established. □
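The identity left to the reader in the proof above can be checked entry by entry. The following is a sketch of that computation, writing $\hat{m}(x) := \int_0^{x_1}\!\int_0^{s} m(t,x_2,x_3)\,dt\,ds$ (our shorthand for the integrated form) and using only that partial derivatives of a form of degree $d$ are again forms, together with Euler's identity for homogeneous functions.

```latex
% Entries with i, j in {2, 3}: differentiate under the integral sign.
% \partial_i \partial_j m is a form of degree d - 2, so at (t, 0, 0) it equals
% t^{d-2} [H_m(1,0,0)]_{ij}.
\begin{aligned}
[H_{\hat m}(1,0,0)]_{ij}
  &= \int_0^1\!\!\int_0^s \frac{\partial^2 m}{\partial x_i \partial x_j}(t,0,0)\,dt\,ds
   = [H_m(1,0,0)]_{ij}\int_0^1 \frac{s^{d-1}}{d-1}\,ds
   = \frac{[H_m(1,0,0)]_{ij}}{d(d-1)}, \\[4pt]
% Entries (1, j) with j in {2, 3}: one integral is removed by the x_1-derivative.
% Euler's identity for the degree d - 1 form \partial_j m at (1,0,0) gives
% \partial_1 \partial_j m (1,0,0) = (d-1) \partial_j m (1,0,0).
[H_{\hat m}(1,0,0)]_{1j}
  &= \int_0^1 \frac{\partial m}{\partial x_j}(t,0,0)\,dt
   = \frac{1}{d}\,\frac{\partial m}{\partial x_j}(1,0,0)
   = \frac{[H_m(1,0,0)]_{1j}}{d(d-1)}, \\[4pt]
% Entry (1, 1): the second x_1-derivative of \hat m is m itself, and Euler's
% identity applied twice gives x^T H_m(x) x = d(d-1) m(x), evaluated at (1,0,0).
[H_{\hat m}(1,0,0)]_{11}
  &= m(1,0,0)
   = \frac{[H_m(1,0,0)]_{11}}{d(d-1)}.
\end{aligned}
```

Since both Hessians are symmetric, these three cases cover all entries.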

We are now ready to prove Theorem 3.18. 

Proof of Theorem 3.18. We first prove that the form $f$ in (3.22) is not sos-convex.
By Lemma 3.7, if $f$ were sos-convex, then all diagonal elements of its Hessian would
have to be sos polynomials. On the other hand, we have from (3.22) that

$$\frac{\partial^2 f(x)}{\partial x_1 \partial x_1} = m(x),$$

which by requirement R2 is not sos. Therefore $f$ is not sos-convex.

It remains to show that there exists a positive value of $\gamma$ for which $f$ becomes
convex. Let us denote the Hessians of $f$, $\int_0^{x_1}\!\int_0^{s} m(t,x_2,x_3)\,dt\,ds$, and $g$, by $H_f$,
$H_{\hat{m}}$, and $H_g$ respectively. So, we have

$$H_f(x) = H_{\hat{m}}(x) + \gamma H_g(x_2,x_3).$$

(Here, $H_g$ is a $3 \times 3$ matrix whose first row and column are zeros.) Convexity of
$f$ is of course equivalent to nonnegativity of the form $y^T H_f(x)\,y$. Since this form
is bi-homogeneous in $x$ and $y$, it is nonnegative if and only if $y^T H_f(x)\,y \geq 0$ on
the bi-sphere

$$B := \{(x,y) \mid \|x\| = 1,\ \|y\| = 1\}.$$

Let us decompose the bi-sphere as

$$B = S \cup \bar{S},$$

where $S$ is defined in (3.25) and

$$\bar{S} := \{(x,y) \mid \|x\| = 1,\ \|y\| = 1,\ x_2^2 + x_3^2 \geq \delta,\ y_2^2 + y_3^2 \geq \delta\}.$$

Lemma 3.20 together with positive definiteness of $H_g$ imply that $y^T H_f(x)\,y$ is
positive on $S$. As for the set $\bar{S}$, let

$$\beta_1 = \min_{(x,y) \in \bar{S}} y^T H_{\hat{m}}(x)\,y,$$

and

$$\beta_2 = \min_{(x,y) \in \bar{S}} y^T H_g(x_2,x_3)\,y.$$

By the assumption of positive definiteness of $H_g$, we have $\beta_2 > 0$. If we now let

$$\gamma > \frac{|\beta_1|}{\beta_2},$$

then

$$\min_{(x,y) \in \bar{S}} y^T H_f(x)\,y > \beta_1 + \frac{|\beta_1|}{\beta_2}\,\beta_2 \geq 0.$$

Hence $y^T H_f(x)\,y$ is nonnegative (in fact positive) everywhere on $B$ and the proof
is completed. □

Finally, we provide an argument for existence of bivariate polynomials of
degree 8, 10, 12, . . . that are convex but not sos-convex.

Corollary 3.21. Consider the form $f$ in (3.22) constructed as described in
Theorem 3.18. Let

$$\tilde{f}(x_1,x_2) = f(x_1,x_2,1).$$

Then, $\tilde{f}$ is convex but not sos-convex.

Proof. The polynomial $\tilde{f}$ is convex because it is the restriction of a convex
function. It is not difficult to see that

$$\frac{\partial^2 \tilde{f}(x_1,x_2)}{\partial x_1 \partial x_1} = m(x_1,x_2,1),$$

which is not sos. Therefore from Lemma 3.7 $\tilde{f}$ is not sos-convex. □

Corollary 3.21 together with the two polynomials in $\tilde{C}_{2,6} \setminus \tilde{\Sigma C}_{2,6}$ and $\tilde{C}_{3,4} \setminus \tilde{\Sigma C}_{3,4}$
presented in (3.16) and (3.20), and with the simple procedure for increasing the
number of variables described at the beginning of Subsection 3.5.2 cover all the
values of $n$ and $d$ for which convex but not sos-convex polynomials exist.

3.6 Concluding remarks and an open problem

Figure 3.1. The tables answer whether every convex polynomial (form) in n variables and of
degree d is sos-convex.

A summary of the results of this chapter is given in Figure 3.1. To conclude,
we would like to point out some similarities between nonnegativity and convexity
that deserve attention: (i) both nonnegativity and convexity are properties that
only hold for even degree polynomials, (ii) for quadratic forms, nonnegativity is
in fact equivalent to convexity, (iii) both notions are NP-hard to check exactly
for degree 4 and larger, and most strikingly (iv) nonnegativity is equivalent to
sum of squares exactly in dimensions and degrees where convexity is equivalent
to sos-convexity. It is unclear to us whether there can be a deeper and more
unifying reason explaining these observations, in particular the last one, which
was the main result of this chapter.

Another intriguing question is to investigate whether one can give a direct
argument proving the fact that $\tilde{\Sigma C}_{n,d} = \tilde{C}_{n,d}$ if and only if $\Sigma C_{n+1,d} = C_{n+1,d}$.
This would eliminate the need for studying polynomials and forms separately,
and in particular would provide a short proof of the result $\Sigma C_{3,4} = C_{3,4}$ given in [2].

Finally, an open problem related to the work in this chapter is to find an
explicit example of a convex form that is not a sum of squares. Blekherman [26]
has shown via volume arguments that for degree $d \geq 4$ and asymptotically for
large $n$ such forms must exist, although no examples are known. In particular, it
would be interesting to determine the smallest value of $n$ for which such a form exists.
We know from Lemma 3.6 that a convex form that is not sos must necessarily
be not sos-convex. Although our several constructions of convex but not
sos-convex polynomials pass this necessary condition, the polynomials themselves are
all sos. The question is particularly interesting from an optimization viewpoint
because it implies that the well-known sum of squares relaxation for minimizing
polynomials [155], [124] may not be exact even for the easy case of minimizing
convex polynomials.

3.7 Appendix A: How the first convex but not sos-convex polynomial was found

In this appendix, we explain how the polynomial in (3.11) was found by solving a
carefully designed sos-program.$^{12}$ The simple methodology described here allows
one to search over a restricted family of nonnegative polynomials that are not sums
of squares. The procedure can potentially be useful in many different settings and
this is our main motivation for presenting this appendix.

Our goal is to find a polynomial $p := p(x)$ whose Hessian $H := H(x)$ satisfies:

$$y^T H(x)\,y \quad \text{psd but not sos.} \tag{3.26}$$

Unfortunately, a constraint of type (3.26) that requires a polynomial to be psd
but not sos is a non-convex constraint and cannot be easily handled with
sos-programming. This is easy to see from a geometric viewpoint. The feasible set
of an sos-program, being a semidefinite program, is always a convex set. On the
other hand, for a fixed degree and dimension, the set of psd polynomials that are
not sos is non-convex. Nevertheless, we describe a technique that allows one to
search over a convex subset of the set of psd but not sos polynomials using
sos-programming. Our strategy can simply be described as follows: (i) Impose the
constraint that the polynomial should not be sos by using a separating hyperplane
(dual functional) for the sos cone. (ii) Impose the constraint that the polynomial
should be psd by requiring that the polynomial times a nonnegative multiplier is
sos.

By definition, the dual cone $\Sigma^*_{n,d}$ of the sum of squares cone $\Sigma_{n,d}$ is the set of
all linear functionals $\mu$ that take nonnegative values on it, i.e.,

$$\Sigma^*_{n,d} := \{\mu \in \mathcal{H}^*_{n,d} \mid \langle \mu, p \rangle \geq 0 \ \ \forall p \in \Sigma_{n,d}\}.$$

Here, the dual space $\mathcal{H}^*_{n,d}$ denotes the space of all linear functionals on the space
$\mathcal{H}_{n,d}$ of forms in $n$ variables and degree $d$, and $\langle \cdot, \cdot \rangle$ represents the pairing between
elements of the primal and the dual space. If a form is not sos, we can find a
dual functional $\mu \in \Sigma^*_{n,d}$ that separates it from the closed convex cone $\Sigma_{n,d}$. The
basic idea behind this is the well known separating hyperplane theorem in convex
analysis; see e.g. [38,142].

As for step (ii) of our strategy above, our approach for guaranteeing that
a form $g$ is nonnegative will be to require that $g(x) \cdot (\sum_i x_i^2)^r$ be sos for some integer
$r \geq 1$. Our choice of the multiplier $(\sum_i x_i^2)^r$ as opposed to any other psd multiplier
is motivated by a result of Reznick [137] on Hilbert's 17th problem. The 17th
problem, which was answered in the affirmative by Artin [19], asks whether every
psd form must be a sum of squares of rational functions. The affirmative answer
to this question implies that if a form $g$ is psd, then there must exist an sos form
$s$, such that $g \cdot s$ is sos. Reznick showed in [137] that if $g$ is positive definite, one
can always take $s(x) = (\sum_i x_i^2)^r$, for sufficiently large $r$. For all polynomials that
we needed to prove psd in this chapter, taking $r = 1$ has been good enough.

12 The term "sos-program" is usually used to refer to semidefinite programs that have sum of
squares constraints.

For our particular purpose of finding a convex but not sos-convex polynomial,
we apply the strategy outlined above to make the first diagonal element of the
Hessian psd but not sos (recall Lemma 3.7). More concretely, the polynomial in
(3.11) was derived from a feasible solution to the following sos-program:

- Parameterize $p \in \mathcal{H}_{3,8}$ and compute its Hessian $H = \frac{\partial^2 p}{\partial x^2}$.

- Impose the constraints

$$(x_1^2 + x_2^2 + x_3^2)\cdot y^T H(x)\,y \quad \text{sos}, \tag{3.27}$$

$$\langle \mu, H_{1,1} \rangle = -1 \tag{3.28}$$

(for some dual functional $\mu \in \Sigma^*_{3,6}$).

The decision variables of this sos-program are the coefficients of the polynomial
$p$ that also appear in the entries of the Hessian matrix $H$. (The polynomial $H_{1,1}$
in (3.28) denotes the first diagonal element of $H$.) The dual functional $\mu$ must be
fixed a priori as explained in the sequel. Note that all the constraints are linear in
the decision variables and indeed the feasible set described by these constraints is
a convex set. Moreover, the reader should be convinced by now that if the above
sos-program is feasible, then the solution $p$ is a convex polynomial that is not
sos-convex.

The reason why we chose to parameterize $p$ as a form in $\mathcal{H}_{3,8}$ is that a minimal
case where a diagonal element of the Hessian (whose degree is lower by 2) can be
psd but not sos is among the forms in $\mathcal{H}_{3,6}$. The role of the dual functional
$\mu \in \Sigma^*_{3,6}$ in (3.28) is to separate the polynomial $H_{1,1}$ from $\Sigma_{3,6}$. Once an ordering
on the monomials of $H_{1,1}$ is fixed, this constraint can be imposed numerically as

$$\langle \mu, H_{1,1} \rangle = b^T \vec{H}_{1,1} = -1, \tag{3.29}$$

where $\vec{H}_{1,1}$ denotes the vector of coefficients of the polynomial $H_{1,1}$ and $b \in \mathbb{R}^{28}$
represents our separating hyperplane, which must be computed prior to solving
the above sos-program.

There are several ways to obtain a separating hyperplane for $\Sigma_{3,6}$. Our
approach was to find a hyperplane that separates the Motzkin form $M$ in (3.1) from
$\Sigma_{3,6}$. This can be done in at least a couple of different ways. For example, we
can formulate a semidefinite program that requires the Motzkin form to be sos.
This program is clearly infeasible. Any feasible solution to its dual semidefinite
program will give us the desired separating hyperplane. Alternatively, we can
set up an sos-program that finds the Euclidean projection $M_p := M_p(x)$ of the
Motzkin form $M$ onto the cone $\Sigma_{3,6}$. Since the projection is done onto a convex
set, the hyperplane tangent to $\Sigma_{3,6}$ at $M_p$ will be supporting $\Sigma_{3,6}$, and can serve
as our separating hyperplane.
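The bookkeeping behind the pairing $b^T \vec{H}_{1,1}$ in (3.29) can be sketched in a few lines: fix an ordering of the 28 monomials of a ternary sextic, store a form as its coefficient vector, and store a linear functional as a vector $b$ in the same ordering. The functional used below is point evaluation at a fixed $x_0$, which is nonnegative on every sos form and so lies in $\Sigma^*_{3,6}$, but it cannot separate a psd form such as the Motzkin form. It stands in for the actual separating hyperplane, which must come from one of the semidefinite programs described above. The lexicographic ordering and all function names are our own assumptions.

```python
from itertools import product
from math import comb


def monomials(n, d):
    """Exponent tuples of all monomials of degree exactly d in n variables (sorted)."""
    return sorted(e for e in product(range(d + 1), repeat=n) if sum(e) == d)


def coefficient_vector(poly, basis):
    """Coefficients of a form (dict: exponents -> coeff) in the given monomial ordering."""
    return [poly.get(e, 0) for e in basis]


def point_evaluation_functional(x0, basis):
    """Vector b with b^T coeffs(p) = p(x0): each entry is a monomial evaluated at x0."""
    b = []
    for exps in basis:
        value = 1
        for xi, ei in zip(x0, exps):
            value *= xi ** ei
        b.append(value)
    return b


def pairing(b, coeffs):
    return sum(bi * ci for bi, ci in zip(b, coeffs))


BASIS = monomials(3, 6)

# Motzkin form of (3.1): x1^4 x2^2 + x1^2 x2^4 - 3 x1^2 x2^2 x3^2 + x3^6.
MOTZKIN = {(4, 2, 0): 1, (2, 4, 0): 1, (2, 2, 2): -3, (0, 0, 6): 1}

if __name__ == "__main__":
    b = point_evaluation_functional((1, 2, 1), BASIS)
    print(len(BASIS), pairing(b, coefficient_vector(MOTZKIN, BASIS)))
```

Replacing `b` with a vector obtained from the dual of the infeasible SDP would give a negative pairing with the Motzkin form, which is the situation that (3.29) normalizes to $-1$.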

To conclude, we remark that in contrast to previous techniques of constructing
examples of psd but not sos polynomials that are usually based on some
obstructions associated with the number of zeros of polynomials (see e.g. [138]), our
approach has the advantage that the resulting polynomials are positive definite.
Furthermore, additional linear or semidefinite constraints can easily be
incorporated in the search process to impose e.g. various symmetry or sparsity patterns
on the polynomial of interest.

3.8 Appendix B: Certificates complementing the proof of Theorem 3.16

Let $x := (x_1,x_2,x_3)^T$, $y := (y_1,y_2,y_3)^T$, $\tilde{x} := (x_1,x_2)^T$, $\tilde{y} := (y_1,y_2)^T$, and let
$f$, $\tilde{f}$, $H_f$, and $H_{\tilde{f}}$ be as in the proof of Theorem 3.16. This appendix proves that
the form $(x_1^2 + x_2^2)\cdot y^T H_f(x)\,y$ in (3.17) is sos and that the polynomial $\tilde{y}^T H_{\tilde{f}}(\tilde{x})\,\tilde{y}$
in (3.18) is not sos, hence proving respectively that $f$ is convex and $\tilde{f}$ is not
sos-convex.

A rational sos decomposition of $(x_1^2 + x_2^2)\cdot y^T H_f(x)\,y$, which is a form in 6
variables of degree 8, is as follows:



where $z$ is the vector of monomials

$$
\begin{aligned}
z = [\,& x_2x_3^2y_3,\ x_2x_3^2y_2,\ x_2x_3^2y_1,\ x_2^2x_3y_3,\ x_2^2x_3y_2,\ x_2^2x_3y_1,\ x_2^3y_3,\ x_2^3y_2,\ x_2^3y_1, \\
& x_1x_3^2y_3,\ x_1x_3^2y_2,\ x_1x_3^2y_1,\ x_1x_2x_3y_3,\ x_1x_2x_3y_2,\ x_1x_2x_3y_1,\ x_1x_2^2y_3,\ x_1x_2^2y_2,\ x_1x_2^2y_1, \\
& x_1^2x_3y_3,\ x_1^2x_3y_2,\ x_1^2x_3y_1,\ x_1^2x_2y_3,\ x_1^2x_2y_2,\ x_1^2x_2y_1,\ x_1^3y_3,\ x_1^3y_2,\ x_1^3y_1\,]^T,
\end{aligned}
$$

and $Q$ is the $27 \times 27$ positive definite matrix$^{13}$ presented on the next page



13 Whenever we state a matrix is positive definite, this claim is backed up by
a rational $LDL^T$ factorization of the matrix that the reader can find online at
http://aaa.lids.mit.edu/software.



$(x_1^2 + x_2^2)\cdot y^T H_f(x)\,y$

$Q = [\,Q_1 \;\; Q_2\,]$,

Next, we prove that the polynomial $\tilde{y}^T H_{\tilde{f}}(\tilde{x})\,\tilde{y}$ in (3.18) is not sos. Let us first
present this polynomial and give it a name:

t(x, y) := y T Hj(i)y = 294xix 2 2/I - 6995a&/iJ/2 - 10200xij/iy 2 - 4356x 2 x 2 y 2 

-2904x?yiy 2 - 11475xix 2 y 2 + 13680a^/ij/ 2 + 4764xix 2 y 2 
+A7QAx\y 1 y 2 + 6429x 2 x^y 2 + 294x|j/ij/ 2 - 13990xix|y| 
-12123x 2 x 2 y 2 - 3872x 2 2/it/2 + ^^2/1 + 20520xiX 2 2/| 
+29076xix 2 2/i2/2 - 2A2A§x 1 x 2 2 y 1 y 2 + 14901xi^yiy 2 
+ 15039x 2 x 2 2/i2/ 2 + 8572x?x 2 yiy 2 + ^3x 2 x 2 2/f + 1442y 2 
-12360x 2 ?/ 2 _ 5100x 2 y 2 + ^f^x 2 ^ + 72§§x\y\ 
+ 7J W4y 2 2 + 1J T4y! - 1936xi2/ 2 - 84xiy 2 + ^y 2 
+7269x 2 y 2 + 4356x 2 y 2 - 3825x??/ 2 - 180x?y 2 + 632j/ij/ 2 
+2310x^ 2 + SOlSxix^ 2 - 22 9 50x 2 x 2 2/i2/2 - 45025x^ 2 

-1505x^1^/2 - A0Alx 3 2 yf - 3010x?x 2 2/ 2 + 5013x?x 2 y|. 

Note that $t$ is a polynomial in 4 variables of degree 6 that is quadratic in $\tilde{y}$. Let
us denote the cone of sos polynomials in 4 variables $(\tilde{x},\tilde{y})$ that have degree 6
and are quadratic in $\tilde{y}$ by $\hat{\Sigma}_{4,6}$, and its dual cone by $\hat{\Sigma}^*_{4,6}$. Our proof will simply
proceed by presenting a dual functional $\xi \in \hat{\Sigma}^*_{4,6}$ that takes a negative value on
the polynomial $t$. We fix the following ordering of monomials in what follows:

$$
\begin{aligned}
v = [\,& y_2^2,\ y_1y_2,\ y_1^2,\ x_2y_2^2,\ x_2y_1y_2,\ x_2y_1^2,\ x_2^2y_2^2,\ x_2^2y_1y_2,\ x_2^2y_1^2,\ x_2^3y_2^2,\ x_2^3y_1y_2,\ x_2^3y_1^2, \\
& x_2^4y_2^2,\ x_2^4y_1y_2,\ x_2^4y_1^2,\ x_1y_2^2,\ x_1y_1y_2,\ x_1y_1^2,\ x_1x_2y_2^2,\ x_1x_2y_1y_2,\ x_1x_2y_1^2, \\
& x_1x_2^2y_2^2,\ x_1x_2^2y_1y_2,\ x_1x_2^2y_1^2,\ x_1x_2^3y_2^2,\ x_1x_2^3y_1y_2,\ x_1x_2^3y_1^2,\ x_1^2y_2^2,\ x_1^2y_1y_2,\ x_1^2y_1^2, \\
& x_1^2x_2y_2^2,\ x_1^2x_2y_1y_2,\ x_1^2x_2y_1^2,\ x_1^2x_2^2y_2^2,\ x_1^2x_2^2y_1y_2,\ x_1^2x_2^2y_1^2,\ x_1^3y_2^2,\ x_1^3y_1y_2,\ x_1^3y_1^2, \\
& x_1^3x_2y_2^2,\ x_1^3x_2y_1y_2,\ x_1^3x_2y_1^2,\ x_1^4y_2^2,\ x_1^4y_1y_2,\ x_1^4y_1^2\,]^T.
\end{aligned}
\tag{3.30}
$$

Let $\vec{t}$ represent the vector of coefficients of $t$ ordered according to the list of
monomials above; i.e., $t = \vec{t}^{\,T} v$. Using the same ordering, we can represent our
dual functional $\xi$ with the vector
c = [19338, -2485, 17155, 6219, -4461, 11202, 4290, -5745, 13748, 3304, -5404, 
13227, 3594, -4776, 19284, 2060, 3506, 5116, 366, -2698, 6231, -487, -2324, 
4607, 369, -3657, 3534, 6122, 659, 7057, 1646, 1238, 1752, 2797, -940, 4608, 
-200, 1577, -2030, -513, -3747, 2541, 15261, 220, 7834] T . 

We have 

_ v T ^ 364547 „ 
(£,t) = c T t = ^<0- 

On the other hand, we claim that $\xi \in \hat{\Sigma}^*_{4,6}$; i.e., for any form $w \in \hat{\Sigma}_{4,6}$, we should
have

$$\langle \xi, w \rangle = c^T \vec{w} \geq 0, \tag{3.31}$$

where $\vec{w}$ here denotes the coefficients of $w$ listed according to the ordering in
(3.30). Indeed, if $w$ is sos, then it can be written in the form

$$w(x) = z^T Q z = \mathrm{Tr}\, Q \cdot zz^T,$$

for some symmetric positive semidefinite matrix $Q$, and a vector of monomials

$$z = [\,y_2,\ y_1,\ x_2y_2,\ x_2y_1,\ x_1y_2,\ x_1y_1,\ x_2^2y_2,\ x_2^2y_1,\ x_1x_2y_2,\ x_1x_2y_1,\ x_1^2y_2,\ x_1^2y_1\,]^T.$$

It is not difficult to see that

$$c^T \vec{w} = \mathrm{Tr}\, Q \cdot (zz^T)|_c, \tag{3.32}$$

where by $(zz^T)|_c$ we mean a matrix where each monomial in $zz^T$ is replaced with
the corresponding element of the vector $c$. This yields the matrix



$$
(zz^T)|_c =
\begin{bmatrix}
19338 & -2485 & 6219 & -4461 & 2060 & 3506 & 4290 & -5745 & 366 & -2698 & 6122 & 659 \\
-2485 & 17155 & -4461 & 11202 & 3506 & 5116 & -5745 & 13748 & -2698 & 6231 & 659 & 7057 \\
6219 & -4461 & 4290 & -5745 & 366 & -2698 & 3304 & -5404 & -487 & -2324 & 1646 & 1238 \\
-4461 & 11202 & -5745 & 13748 & -2698 & 6231 & -5404 & 13227 & -2324 & 4607 & 1238 & 1752 \\
2060 & 3506 & 366 & -2698 & 6122 & 659 & -487 & -2324 & 1646 & 1238 & -200 & 1577 \\
3506 & 5116 & -2698 & 6231 & 659 & 7057 & -2324 & 4607 & 1238 & 1752 & 1577 & -2030 \\
4290 & -5745 & 3304 & -5404 & -487 & -2324 & 3594 & -4776 & 369 & -3657 & 2797 & -940 \\
-5745 & 13748 & -5404 & 13227 & -2324 & 4607 & -4776 & 19284 & -3657 & 3534 & -940 & 4608 \\
366 & -2698 & -487 & -2324 & 1646 & 1238 & 369 & -3657 & 2797 & -940 & -513 & -3747 \\
-2698 & 6231 & -2324 & 4607 & 1238 & 1752 & -3657 & 3534 & -940 & 4608 & -3747 & 2541 \\
6122 & 659 & 1646 & 1238 & -200 & 1577 & 2797 & -940 & -513 & -3747 & 15261 & 220 \\
659 & 7057 & 1238 & 1752 & 1577 & -2030 & -940 & 4608 & -3747 & 2541 & 220 & 7834
\end{bmatrix},
$$

which is positive definite. Therefore, equation (3.32) along with the fact that $Q$
is positive semidefinite implies that (3.31) holds. This completes the proof.
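The substitution $(zz^T)|_c$ is mechanical and can be reproduced from the vector $c$ and the monomial ordering in (3.30). The sketch below encodes each entry of $z$ by the exponents of $x_1$, $x_2$, and $y_1$ (the exponent of $y_2$ is then determined), looks up the product of two entries in the ordering of (3.30), and reads off the matching entry of $c$. It also computes the pivots of an exact elimination without pivoting, in the spirit of the rational $LDL^T$ check mentioned in footnote 13; a symmetric matrix is positive definite exactly when all of these pivots are positive. The encoding and function names are ours and are not taken from the online certificates.

```python
from fractions import Fraction

# The vector c representing the dual functional, in the ordering of (3.30).
C = [19338, -2485, 17155, 6219, -4461, 11202, 4290, -5745, 13748, 3304, -5404,
     13227, 3594, -4776, 19284, 2060, 3506, 5116, 366, -2698, 6231, -487, -2324,
     4607, 369, -3657, 3534, 6122, 659, 7057, 1646, 1238, 1752, 2797, -940, 4608,
     -200, 1577, -2030, -513, -3747, 2541, 15261, 220, 7834]

# Entries of z as (a, b, k), meaning x1^a * x2^b * y1^k * y2^(1 - k).
Z = [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1), (1, 0, 0), (1, 0, 1),
     (0, 2, 0), (0, 2, 1), (1, 1, 0), (1, 1, 1), (2, 0, 0), (2, 0, 1)]

# Position of the first x-monomial with a given power of x1 in (3.30):
# 1, x2, ..., x2^4, then x1, x1 x2, ..., x1 x2^3, then x1^2, ..., and so on.
X_OFFSET = {0: 0, 1: 5, 2: 9, 3: 12, 4: 14}


def index_in_v(a, b, k):
    """0-based index in (3.30) of x1^a x2^b y1^k y2^(2 - k)."""
    return 3 * (X_OFFSET[a] + b) + k


def substituted_gram(c=C, z=Z):
    """Matrix (z z^T)|_c: each monomial of z z^T replaced by its entry of c."""
    return [[c[index_in_v(ai + aj, bi + bj, ki + kj)] for (aj, bj, kj) in z]
            for (ai, bi, ki) in z]


def elimination_pivots(matrix):
    """Pivots of exact Gaussian elimination without pivoting (stops at a pivot <= 0)."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n = len(a)
    pivots = []
    for k in range(n):
        pivots.append(a[k][k])
        if a[k][k] <= 0:
            break
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= factor * a[k][j]
    return pivots


def is_positive_definite(matrix):
    pivots = elimination_pivots(matrix)
    return len(pivots) == len(matrix) and all(p > 0 for p in pivots)


if __name__ == "__main__":
    M = substituted_gram()
    print(is_positive_definite(M))
```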



Part II: 

Lyapunov Analysis and Computation 



Chapter 4 



Lyapunov Analysis of Polynomial Differential Equations



In the last two chapters of this thesis, our focus will turn to Lyapunov analysis 
of dynamical systems. The current chapter presents new results on Lyapunov 
analysis of polynomial vector fields. The content here is based on the works 
in [10] and [4], as well as some more recent results. 

4.1 Introduction

We will be concerned for the most part of this chapter with a continuous time
dynamical system

$$\dot{x} = f(x), \tag{4.1}$$

where $f : \mathbb{R}^n \to \mathbb{R}^n$ is a polynomial and has an equilibrium at the origin, i.e.,
$f(0) = 0$. Arguably, polynomial differential equations are among
the most widely encountered in engineering and the sciences. For stability analysis
of these systems, it is most common (and quite natural) to search for Lyapunov
functions that are polynomials themselves. When such a candidate Lyapunov
function is used, the conditions of Lyapunov's theorem reduce to a set of
polynomial inequalities. For instance, if establishing global asymptotic stability of the
origin is desired, one would require a radially unbounded polynomial Lyapunov
candidate $V(x) : \mathbb{R}^n \to \mathbb{R}$ to vanish at the origin and satisfy

$$V(x) > 0 \quad \forall x \neq 0 \tag{4.2}$$

$$\dot{V}(x) = \langle \nabla V(x), f(x) \rangle < 0 \quad \forall x \neq 0. \tag{4.3}$$

Here, $\dot{V}$ denotes the time derivative of $V$ along the trajectories of (4.1), $\nabla V(x)$ is
the gradient vector of $V$, and $\langle \cdot, \cdot \rangle$ is the standard inner product in $\mathbb{R}^n$. In some
other variants of the analysis problem, e.g. if LaSalle's invariance principle is to
be used, or if the goal is to prove boundedness of trajectories of (4.1), then the
inequality in (4.3) is replaced with

$$\dot{V}(x) \leq 0 \quad \forall x. \tag{4.4}$$

In any case, the problem arising from this analysis approach is that even though
polynomials of a given degree are finitely parameterized, the computational
problem of searching for a polynomial $V$ satisfying inequalities of the type (4.2), (4.3),
(4.4) is intractable. An approach pioneered in [118] and widely popular by now is
to replace the positivity (or nonnegativity) conditions by the requirement of the
existence of a sum of squares (sos) decomposition:

$$V \quad \text{sos} \tag{4.5}$$

$$-\dot{V} = -\langle \nabla V, f \rangle \quad \text{sos}. \tag{4.6}$$

As we saw in the previous chapter, sum of squares decomposition is a
sufficient condition for polynomial nonnegativity that can be efficiently checked with
semidefinite programming. For a fixed degree of a polynomial Lyapunov
candidate $V$, the search for the coefficients of $V$ subject to the constraints (4.5) and
(4.6) is a semidefinite program (SDP). We call a Lyapunov function satisfying
both sos conditions in (4.5) and (4.6) a sum of squares Lyapunov function. We
emphasize that this is the sensible definition of a sum of squares Lyapunov
function and not what the name may suggest, which is a Lyapunov function that is a
sum of squares. Indeed, the underlying semidefinite program will find a Lyapunov
function $V$ if and only if $V$ satisfies both conditions (4.5) and (4.6).

Over the last decade, the applicability of sum of squares Lyapunov functions
has been explored and extended in many directions and a multitude of sos
techniques have been developed to tackle a range of problems in systems and control.
We refer the reader to the by no means exhaustive list of works [76], [41], [43], [83],
[131], [114], [133], [42], [6], [21], [159] and references therein. Despite the wealth
of research in this area, the converse question of whether the existence of a
polynomial Lyapunov function implies the existence of a sum of squares Lyapunov
function has remained elusive. This question naturally comes in two variants:

Problem 1: Does existence of a polynomial Lyapunov function of a given
degree imply existence of a polynomial Lyapunov function of the same degree
that satisfies the sos conditions in (4.5) and (4.6)?

Problem 2: Does existence of a polynomial Lyapunov function of a given
degree imply existence of a polynomial Lyapunov function of possibly higher degree
that satisfies the sos conditions in (4.5) and (4.6)?

The notion of stability of interest in this chapter, for which we will study the
questions above, is global asymptotic stability (GAS); see e.g. [86, Chap. 4] for
a precise definition. Of course, a fundamental question that comes before the
problems mentioned above is the following: 

Problem 0: If a polynomial dynamical system is globally asymptotically 
stable, does it admit a polynomial Lyapunov function? 
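To make the sos conditions (4.5) and (4.6) concrete, consider a small worked case. The planar vector field below is a hypothetical illustration and is not one of the systems studied in this chapter. With $V(x) = x_1^2 + x_2^2$, which is trivially sos, a direct computation gives $-\dot{V}(x) = 2x_1^2 + 2x_2^4 = (\sqrt{2}\,x_1)^2 + (\sqrt{2}\,x_2^2)^2$, so $V$ is a sum of squares Lyapunov function in the sense defined above. The sketch only confirms this identity numerically at sample points; it does not search for $V$, which would require a semidefinite programming solver.

```python
import math
import random


def f(x):
    """Hypothetical polynomial vector field with an equilibrium at the origin."""
    x1, x2 = x
    return (-x1 + x2, -x1 - x2 ** 3)


def V(x):
    x1, x2 = x
    return x1 ** 2 + x2 ** 2


def grad_V(x):
    x1, x2 = x
    return (2 * x1, 2 * x2)


def V_dot(x):
    """Derivative of V along trajectories: the inner product of grad V and f."""
    return sum(g * fi for g, fi in zip(grad_V(x), f(x)))


def neg_V_dot_as_sos(x):
    """Explicit sum of squares decomposition of -V_dot for this example."""
    x1, x2 = x
    return (math.sqrt(2) * x1) ** 2 + (math.sqrt(2) * x2 ** 2) ** 2


def max_identity_error(trials=1000, seed=0):
    """Largest gap between -V_dot and its claimed sos form over random samples."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        x = (rng.uniform(-2.0, 2.0), rng.uniform(-2.0, 2.0))
        worst = max(worst, abs(-V_dot(x) - neg_V_dot_as_sos(x)))
    return worst


if __name__ == "__main__":
    print(max_identity_error())
```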

4.1.1 Contributions and organization of this chapter

In this chapter, we give explicit counterexamples that answer Problem 0 and
Problem 1 in the negative. This is done in Section 4.3 and Subsection 4.4.2
respectively. On the other hand, in Subsection 4.4.3, we give a positive answer to
Problem 2 for the case where the vector field is homogeneous (Theorem 4.8) or
when it is planar and an additional mild assumption is met (Theorem 4.10). The
proofs of these two theorems are quite simple and rely on powerful
Positivstellensatz results due to Scheiderer (Theorems 4.7 and 4.9). In Section 4.5, we extend
these results to derive a converse sos Lyapunov theorem for robust stability of
switched linear systems. It will be proven that if such a system is stable under
arbitrary switching, then it admits a common polynomial Lyapunov function that
is sos and such that the negative of its derivative is also sos (Theorem 4.11). We also
show that for switched linear systems (both in discrete and continuous time), if
the inequality on the decrease condition of a Lyapunov function is satisfied as
a sum of squares, then the Lyapunov function itself is automatically a sum of
squares (Propositions 4.14 and 4.15). We list a number of related open problems
in Section 4.6.

Before these contributions are presented, we establish a hardness result for 
the problem of deciding asymptotic stability of cubic homogeneous vector fields 
in the next section. We also present some byproducts of this result, including a 
Lyapunov-inspired technique for proving positivity of forms. 

4.2 Complexity considerations for deciding stability of polynomial vector fields

It is natural to ask whether stability of equilibrium points of polynomial vector
fields can be decided in finite time. In fact, this is a well-known question of Arnold
that appears in [17]:

"Is the stability problem for stationary points algorithmically decidable? The
well-known Lyapounov theorem$^{1}$ solves the problem in the absence of
eigenvalues with zero real parts. In more complicated cases, where the stability
depends on higher order terms in the Taylor series, there exists no algebraic
criterion.

Let a vector field be given by polynomials of a fixed degree, with rational
coefficients. Does an algorithm exist, allowing to decide, whether the stationary
point is stable?"

1 The theorem that Arnold is referring to here is the indirect method of Lyapunov related to
linearization. This is not to be confused with Lyapunov's direct method (or the second method),
which is what we are concerned with in sections that follow.

Later in [51], the question of Arnold is quoted with more detail: 

"In my problem the coefficients of the polynomials of known degree and 
of a known number of variables are written on the tape of the standard 
Turing machine in the standard order and in the standard representation. 
The problem is whether there exists an algorithm (an additional text for the 
machine independent of the values of the coefficients) such that it solves the 
stability problem for the stationary point at the origin (i.e., always stops 
giving the answer "stable" or "unstable"). 

I hope, this algorithm exists if the degree is one. It also exists when the 
dimension is one. My conjecture has always been that there is no algorithm 
for some sufficiently high degree and dimension, perhaps for dimension 3 and 
degree 3 or even 2. I am less certain about what happens in dimension 2. Of 
course the nonexistence of a general algorithm for a fixed dimension working 
for arbitrary degree or for a fixed degree working for an arbitrary dimension, 
or working for all polynomials with arbitrary degree and dimension would 
also be interesting." 

To our knowledge, there has been no formal resolution to these questions, 
neither for the case of stability in the sense of Lyapunov, nor for the case of 
asymptotic stability (in its local or global version). In [51], da Costa and Doria 
show that if the right hand side of the differential equation contains elementary 
functions (sines, cosines, exponentials, absolute value function, etc.), then there 
is no algorithm for deciding whether the origin is stable or unstable. They also 
present a dynamical system in [52] where one cannot decide whether a Hopf bi- 
furcation will occur or whether there will be parameter values such that a stable 
fixed point becomes unstable. In earlier work, Arnold himself demonstrates some 
of the difficulties that arise in stability analysis of polynomial systems by pre- 
senting a parametric polynomial system in 3 variables and degree 5, where the 
boundary between stability and instability in parameter space is not a semialge- 
braic set [16]. A relatively larger number of undecidability results are available for 
questions related to other properties of polynomial vector fields, such as reach- 
ability [73] or boundedness of domain of definition [65], or for questions about 
stability of hybrid systems [30], [35], [34], [29]. We refer the interested reader to 
the survey papers in [37], [73], [156], [33], [36].

We are also interested to know whether the answer to the undecidability question
for asymptotic stability changes if the dynamics is restricted to be homogeneous.
A polynomial vector field $\dot{x} = f(x)$ is homogeneous if all entries of $f$ are
homogeneous polynomials of the same degree. Homogeneous systems are extensively
studied in the literature on nonlinear control [149], [14], [68], [23], [74], [146],
[106], and some of the results of this chapter (both negative and positive) are
derived specifically for this class of systems. A basic fact about homogeneous vector
fields is that for these systems the notions of local and global stability are
equivalent. Indeed, a homogeneous vector field of degree $d$ satisfies
$f(\lambda x) = \lambda^d f(x)$ for any scalar $\lambda$, and therefore the value of $f$ on the
unit sphere determines its value everywhere. It is also well-known that an
asymptotically stable homogeneous system admits a homogeneous Lyapunov function [72], [146].

Naturally, questions regarding complexity of deciding asymptotic stability and 
questions about existence of Lyapunov functions are related. For instance, if one 
proves that for a class of polynomial vector fields, asymptotic stability implies 
existence of a polynomial Lyapunov function together with a computable upper 
bound on its degree, then the question of asymptotic stability for that class be- 
comes decidable. This is due to the fact that given any polynomial system and 
any integer d, the question of deciding whether the system admits a polynomial 
Lyapunov function of degree d can be answered in finite time using quantifier 
elimination. 

For the case of linear systems (i.e., homogeneous systems of degree 1), the 
situation is particularly nice. If such a system is asymptotically stable, then there 
always exists a quadratic Lyapunov function. Asymptotic stability of a linear 
system $\dot{x} = Ax$ is equivalent to the easily checkable algebraic criterion that the
eigenvalues of A be in the open left half complex plane. Deciding this property of 
the matrix A can formally be done in polynomial time, e.g. by solving a Lyapunov 
equation [36]. 
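
To make the Lyapunov-equation test concrete, the following is a minimal sketch for the $2 \times 2$ case only. It solves $A^T P + PA = -I$ exactly with rational arithmetic and declares $A$ stable when the solution $P$ is positive definite. The helper names (`det3`, `solve_lyapunov_2x2`, `is_hurwitz_2x2`) and the test matrices are illustrative assumptions and are not taken from [36].

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve_lyapunov_2x2(A):
    """Solve A^T P + P A = -I for symmetric P = [[p, q], [q, r]].

    Returns (p, q, r), or None when the linear system is singular.
    """
    (a, b), (c, d) = [[Fraction(v) for v in row] for row in A]
    # Entries (1,1), (1,2), (2,2) of A^T P + P A, written as M @ (p, q, r).
    M = [[2 * a, 2 * c, 0],
         [b, a + d, c],
         [0, 2 * b, 2 * d]]
    rhs = [Fraction(-1), Fraction(0), Fraction(-1)]
    D = det3(M)
    if D == 0:
        return None
    solution = []
    for j in range(3):  # Cramer's rule, one unknown per column
        Mj = [row[:] for row in M]
        for i in range(3):
            Mj[i][j] = rhs[i]
        solution.append(det3(Mj) / D)
    return tuple(solution)


def is_hurwitz_2x2(A):
    """True when the Lyapunov equation has a positive definite solution."""
    P = solve_lyapunov_2x2(A)
    if P is None:
        return False  # trace or determinant of A is zero, so A is not Hurwitz
    p, q, r = P
    return p > 0 and p * r - q * q > 0
```

For instance, `is_hurwitz_2x2([[-1, 1], [0, -2]])` evaluates to `True`, while the rotation matrix `[[0, 1], [-1, 0]]` is rejected because the linear system is singular.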

Moving up in the degree, it is not difficult to show that if a homogeneous 
polynomial vector field has even degree, then it can never be asymptotically stable; 
see e.g. [72, p. 283]. So the next interesting case occurs for homogeneous vector 
fields of degree 3. We will prove below that determining asymptotic stability for 
such systems is strongly NP-hard. This gives a lower bound on the complexity 
of this problem. It is an interesting open question to investigate whether in this 
specific setting, the problem is also undecidable. 

One implication of our NP-hardness result is that unless P=NP, we should
not expect sum of squares Lyapunov functions of "low enough" degree to always
exist, even when the analysis is restricted to cubic homogeneous vector fields. The
semidefinite program arising from a search for an sos Lyapunov function of degree
$2d$ for such a vector field in $n$ variables has size in the order of $\binom{n+d}{d+1}$. This number
is polynomial in $n$ for fixed $d$ (but exponential in $n$ when $d$ grows linearly in $n$).
Therefore, unlike the case of linear systems, we should not hope to have a bound
on the degree of sos Lyapunov functions that is independent of the dimension.
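
The growth described above can be tabulated directly. The sketch below assumes that the Lyapunov function $V$ is a form of degree $2d$ in $n$ variables and that the vector field is cubic and homogeneous, so that $-\dot{V}$ is a form of degree $2d + 2$. Under that assumption the Gram matrices in the two sos constraints are indexed by the monomials of degree $d$ and $d + 1$ respectively. The function name `gram_sizes` is hypothetical.

```python
from math import comb


def gram_sizes(n, d):
    """Side lengths of the Gram matrices for V (degree 2d) and -Vdot (degree 2d + 2).

    A form of degree 2k in n variables is sos iff it equals z^T Q z with Q
    positive semidefinite, where z lists the comb(n + k - 1, k) monomials of degree k.
    """
    size_V = comb(n + d - 1, d)
    size_Vdot = comb(n + d, d + 1)
    return size_V, size_Vdot


if __name__ == "__main__":
    # Fixed d: polynomial growth in n.  d = n: exponential growth in n.
    for n in (2, 4, 6, 8, 10):
        print(n, gram_sizes(n, 2), gram_sizes(n, n))
```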

We postpone our study of existence of sos Lyapunov functions to Section 4.4 
and proceed for now with the following complexity result. 

Theorem 4.1. Deciding asymptotic stability of homogeneous cubic polynomial 
vector fields is strongly NP-hard. 

The main intuition behind the proof of this theorem is the following idea: We 
will relate the solution of a combinatorial problem not to the behavior of the 
trajectories of a cubic vector field that are hard to get a handle on, but instead to 
properties of a Lyapunov function that proves asymptotic stability of this vector 
field. As we will see shortly, insights from Lyapunov theory make the proof of 
this theorem quite simple. The reduction is broken into two steps: 

ONE-IN-THREE 3SAT

↓

positivity of quartic forms

↓

asymptotic stability of cubic vector fields

In the course of presenting these reductions, we will also discuss some corollar- 
ies that are not directly related to our study of asymptotic stability, but are of 
independent interest. 

■ 4.2.1 Reduction from ONE-IN-THREE 3SAT to positivity of quartic forms 

As we remarked in Chapter 2, NP-hardness of deciding nonnegativity (i.e., posi- 
tive semidefiniteness) of quartic forms is well-known. The proof commonly cited in 
the literature is based on a reduction from the matrix copositivity problem [109]: 
given a symmetric $n \times n$ matrix $Q$, decide whether $x^T Q x \geq 0$ for all $x$'s that
are elementwise nonnegative. Clearly, a matrix $Q$ is copositive if and only if the
quartic form $z^T Q z$, with $z_i := x_i^2$, is nonnegative. The original reduction [109]
proving NP-hardness of testing matrix copositivity is from the subset sum prob- 
lem and only establishes weak NP-hardness. However, reductions from the stable 
set problem to matrix copositivity are also known [56], [58] and they result in 
NP-hardness in the strong sense. Alternatively, strong NP-hardness of deciding 
nonnegativity of quartic forms follows immediately from NP-hardness of deciding 
convexity of quartic forms (proven in Chapter 2) or from NP-hardness of deciding 
nonnegativity of biquadratic forms (proven in [97]).

For reasons that will become clear shortly, we are interested in showing hardness
of deciding positive definiteness of quartic forms as opposed to positive
semidefiniteness. This is in some sense even easier to accomplish. A very straight- 
forward reduction from 3SAT proves NP-hardness of deciding positive definiteness 
of polynomials of degree 6. By using ONE-IN-THREE 3SAT instead, we will re- 
duce the degree of the polynomial from 6 to 4. 

Proposition 4.2. It is strongly 2 NP-hard to decide whether a homogeneous poly- 
nomial of degree 4 is positive definite. 

Proof. We give a reduction from ONE-IN-THREE 3SAT which is known to be 
NP-complete [61, p. 259]. Recall that in ONE-IN-THREE 3SAT, we are given a 
3SAT instance (i.e., a collection of clauses, where each clause consists of exactly 
three literals, and each literal is either a variable or its negation) and we are asked 
to decide whether there exists a {0, 1} assignment to the variables that makes the 
expression true with the additional property that each clause has exactly one true 
literal. 

To avoid introducing unnecessary notation, we present the reduction on a 
specific instance. The pattern will make it obvious that the general construction 
is no different. Given an instance of ONE-IN-THREE 3SAT, such as the following 

$$(x_1 \vee \bar{x}_2 \vee x_4) \wedge (\bar{x}_2 \vee \bar{x}_3 \vee x_5) \wedge (\bar{x}_1 \vee x_3 \vee \bar{x}_5) \wedge (x_1 \vee x_3 \vee x_4), \tag{4.7}$$

we define the quartic polynomial p as follows: 

$$
\begin{aligned}
p(x) = {} & \sum_{i=1}^{5} x_i^2 (1 - x_i)^2 \\
& + (x_1 + (1 - x_2) + x_4 - 1)^2 + ((1 - x_2) + (1 - x_3) + x_5 - 1)^2 \\
& + ((1 - x_1) + x_3 + (1 - x_5) - 1)^2 + (x_1 + x_3 + x_4 - 1)^2.
\end{aligned} \tag{4.8}
$$

Having done so, our claim is that $p(x) > 0$ for all $x \in \mathbb{R}^5$ (or generally for all
$x \in \mathbb{R}^n$) if and only if the ONE-IN-THREE 3SAT instance is not satisfiable.
Note that $p$ is a sum of squares and therefore nonnegative. The only possible
locations for zeros of $p$ are by construction among the points in $\{0,1\}^5$, since
the first sum in (4.8) vanishes only there. If there
is a satisfying Boolean assignment $\hat{x}$ to (4.7) with exactly one true literal per
clause, then $p$ will vanish at point $\hat{x}$. Conversely, if there are no such satisfying
assignments, then for any point in $\{0,1\}^5$, at least one of the terms in (4.8) will
be positive and hence $p$ will have no zeros.
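
For the specific instance (4.7), the claim can be checked by brute force. The sketch below builds $p$ from a clause list and enumerates $\{0,1\}^5$; the clause encoding as `(variable index, negated?)` pairs and the function names are assumptions made for this illustration. Exhaustive enumeration is of course exponential in general and only serves to illustrate the construction.

```python
from itertools import product

# Instance (4.7): each literal is (variable index, negated?).
CLAUSES = [
    [(1, False), (2, True), (4, False)],
    [(2, True), (3, True), (5, False)],
    [(1, True), (3, False), (5, True)],
    [(1, False), (3, False), (4, False)],
]
N_VARS = 5


def literal(x, var, negated):
    """Value of a literal: x_var, or 1 - x_var when negated."""
    value = x[var - 1]
    return 1 - value if negated else value


def p(x):
    """The quartic polynomial (4.8) built from CLAUSES."""
    boolean_part = sum(xi**2 * (1 - xi)**2 for xi in x)
    clause_part = sum(
        (sum(literal(x, var, neg) for var, neg in clause) - 1)**2
        for clause in CLAUSES
    )
    return boolean_part + clause_part


def one_in_three_assignments():
    """All points of {0,1}^n at which p vanishes."""
    return [x for x in product((0, 1), repeat=N_VARS) if p(x) == 0]


if __name__ == "__main__":
    print(one_in_three_assignments())
```

Each zero of $p$ found this way is an assignment with exactly one true literal per clause, so this particular instance is satisfiable exactly when the returned list is nonempty.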

It remains to make $p$ homogeneous. This can be done via introducing a new
scalar variable $y$. If we let

$$p_h(x, y) = y^4 \, p(x/y), \tag{4.9}$$

then we claim that $p_h$ (which is a quartic form) is positive definite if and only if
$p$ constructed as in (4.8) has no zeros. 3 Indeed, if $p$ has a zero at a point $x$, then
that zero is inherited by $p_h$ at the point $(x, 1)$. If $p$ has no zeros, then (4.9) shows
that $p_h$ can only possibly have zeros at points with $y = 0$. However, from the
structure of $p$ in (4.8) we see that

$$p_h(x, 0) = x_1^4 + \cdots + x_5^4,$$

which cannot be zero (except at the origin). This concludes the proof. □

2 Just like our results in Chapter 2, the NP-hardness results of this section will all be in the
strong sense. From here on, we will drop the prefix "strong" for brevity.
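
The homogenization step can also be verified with exact arithmetic. In the sketch below, `p_h` is written out as a polynomial by replacing every constant 1 in (4.8) with $y$ and multiplying the quadratic clause terms by $y^2$. This expansion, the clause encoding, and the function names are assumptions of the illustration rather than notation from the text.

```python
from fractions import Fraction

# Instance (4.7): each literal is (variable index, negated?).
CLAUSES = [
    [(1, False), (2, True), (4, False)],
    [(2, True), (3, True), (5, False)],
    [(1, True), (3, False), (5, True)],
    [(1, False), (3, False), (4, False)],
]


def lit(x, var, negated, one):
    """Literal value, with the constant 1 replaced by `one`."""
    return one - x[var - 1] if negated else x[var - 1]


def p(x):
    """The quartic polynomial (4.8)."""
    return (sum(xi**2 * (1 - xi)**2 for xi in x)
            + sum((sum(lit(x, v, s, 1) for v, s in c) - 1)**2 for c in CLAUSES))


def p_h(x, y):
    """Polynomial form of y^4 * p(x / y), valid for y = 0 as well."""
    return (sum(xi**2 * (y - xi)**2 for xi in x)
            + y**2 * sum((sum(lit(x, v, s, y) for v, s in c) - y)**2
                         for c in CLAUSES))


if __name__ == "__main__":
    x = tuple(Fraction(v) for v in (3, -2, 1, 5, 7))
    y = Fraction(2, 3)
    print(p_h(x, y) == y**4 * p(tuple(xi / y for xi in x)))
    print(p_h(x, 0) == sum(xi**4 for xi in x))
```

At $y = 0$ only the terms $x_i^4$ survive, which is the reason no new zeros appear at infinity for polynomials of this type.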

We present a simple corollary of the reduction we just gave on a problem 
that is of relevance in polynomial integer programming. 4 Recall from Chapter 2 
(Definition 2.7) that a basic semialgebraic set is a set defined by a finite number 
of polynomial inequalities: 

$$S = \{x \in \mathbb{R}^n \mid f_i(x) \geq 0,\ i = 1, \ldots, m\}. \tag{4.10}$$

Corollary 4.3. Given a basic semialgebraic set, it is NP-hard to decide if the set 
contains a lattice point, i.e., a point with integer coordinates. This is true even 
when the set is defined by one constraint (m = 1) and the defining polynomial has 
degree 4. 

Proof. Given an instance of ONE-IN-THREE 3SAT, we define a polynomial p of 
degree 4 as in (4.8), and let the basic semialgebraic set be given by 

$$S = \{x \in \mathbb{R}^n \mid -p(x) \geq 0\}.$$

Since $p$ is nonnegative, $S$ is precisely the zero set of $p$.
Then, by Proposition 4.2, if the ONE-IN-THREE 3SAT instance is not satisfiable,
the set $S$ is empty and hence has no lattice points. Conversely, if the instance is
satisfiable, then $S$ contains at least one point belonging to $\{0,1\}^n$ and therefore
has a lattice point. □

By using the celebrated result on undecidability of checking existence of integer
solutions to polynomial equations (Hilbert's 10th problem), one can show that the
problem considered in the corollary above is in fact undecidable [129]. The same
is true for quadratic integer programming when both the dimension $n$ and the
number of constraints $m$ are allowed to grow as part of the input [84]. The
question of deciding existence of lattice points in polyhedra (i.e., the case where
degree of $f_i$ in (4.10) is 1 for all $i$) is also interesting and in fact very well-studied.
For polyhedra, if both $n$ and $m$ are allowed to grow, then the problem is NP-hard.
This can be seen e.g. as a corollary of the NP-hardness of the INTEGER
KNAPSACK problem (though this is NP-hardness in the weak sense); see [61, p.
247]. However, if $n$ is fixed and $m$ grows, it follows from a result of Lenstra [94]
that the problem can be decided in polynomial time. The same is true if $m$ is
fixed and $n$ grows [153, Cor. 18.7c]. See also [115].

3 In general, homogenization does not preserve positivity. For example, as shown in [138], the
polynomial $x_1^2 + (1 - x_1 x_2)^2$ has no zeros, but its homogenization $x_1^2 y^2 + (y^2 - x_1 x_2)^2$ has zeros at
the points $(1,0,0)^T$ and $(0,1,0)^T$. Nevertheless, positivity is preserved under homogenization
for the special class of polynomials constructed in this reduction, essentially because polynomials
of type (4.8) have no zeros at infinity.

4 We are thankful to Amitabh Basu and Jesus De Loera for raising this question during a
visit at UC Davis, and for later insightful discussions.

■ 4.2.2 Reduction from positivity of quartic forms to asymptotic stability 
of cubic vector fields 

We now present the second step of the reduction and finish the proof of Theo- 
rem 4.1. 

Proof of Theorem 4.1. We give a reduction from the problem of deciding positive
definiteness of quartic forms, whose NP-hardness was established in Proposition 4.2.
Given a quartic form $V := V(x)$, we define the polynomial vector field

$$\dot{x} = -\nabla V(x). \tag{4.11}$$

Note that the vector field is homogeneous of degree 3. We claim that the above
vector field is (locally or equivalently globally) asymptotically stable if and only
if $V$ is positive definite. First, we observe that by construction

$$\dot{V}(x) = \langle \nabla V(x), \dot{x} \rangle = -\|\nabla V(x)\|^2 \leq 0. \tag{4.12}$$

Suppose $V$ is positive definite. By Euler's identity for homogeneous functions, 5
we have $V(x) = \frac{1}{d} x^T \nabla V(x)$ (with $d = 4$ here). Therefore, positive definiteness of $V$ implies that
$\nabla V(x)$ cannot vanish anywhere except at the origin. Hence, $\dot{V}(x) < 0$ for all
$x \neq 0$. In view of Lyapunov's theorem (see e.g. [86, p. 124]), and the fact that a
positive definite homogeneous function is radially unbounded, it follows that the
system in (4.11) is globally asymptotically stable.

For the converse direction, suppose (4.11) is GAS. Our first claim is that global
asymptotic stability together with $\dot{V}(x) \leq 0$ implies that $V$ must be positive
semidefinite. This follows from the following simple argument, which we have
also previously presented in [12] for a different purpose. Suppose for the sake
of contradiction that for some $\hat{x} \in \mathbb{R}^n$ and some $\epsilon > 0$, we had $V(\hat{x}) = -\epsilon < 0$.
Consider a trajectory $x(t; \hat{x})$ of system (4.11) that starts at initial condition $\hat{x}$, and
let us evaluate the function $V$ on this trajectory. Since $V(\hat{x}) = -\epsilon$ and $\dot{V}(x) \leq 0$,
we have $V(x(t; \hat{x})) \leq -\epsilon$ for all $t > 0$. However, this contradicts the fact that by
global asymptotic stability, the trajectory must go to the origin, where $V$, being
a form, vanishes.

5 Euler's identity is easily derived by differentiating both sides of the equation
$V(\lambda x) = \lambda^d V(x)$ with respect to $\lambda$ and setting $\lambda = 1$.

To prove that $V$ is positive definite, suppose by contradiction that for some
nonzero point $x^* \in \mathbb{R}^n$ we had $V(x^*) = 0$. Since we just proved that $V$ has to be
positive semidefinite, the point $x^*$ must be a global minimum of $V$. Therefore,
as a necessary condition of optimality, we should have $\nabla V(x^*) = 0$. But this
contradicts the system in (4.11) being GAS, since the trajectory starting at $x^*$
stays there forever and can never go to the origin. □
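
The mechanism of the proof can be observed numerically on a small example. The sketch below takes the positive definite quartic form $V(x) = x_1^4 - x_1^2 x_2^2 + x_2^4$, integrates $\dot{x} = -\nabla V(x)$ with a forward Euler scheme, and records $V$ along the computed trajectory. The choice of form, step size, horizon, and initial condition are all assumptions of this illustration, and forward Euler is only a crude stand-in for the exact flow.

```python
def V(x1, x2):
    """A positive definite quartic form (illustrative choice)."""
    return x1**4 - x1**2 * x2**2 + x2**4


def grad_V(x1, x2):
    """Gradient of V."""
    return (4 * x1**3 - 2 * x1 * x2**2, 4 * x2**3 - 2 * x1**2 * x2)


def simulate(x1, x2, step=0.01, n_steps=5000):
    """Forward Euler on xdot = -grad V(x); returns V along the iterates."""
    values = [V(x1, x2)]
    for _ in range(n_steps):
        g1, g2 = grad_V(x1, x2)
        x1, x2 = x1 - step * g1, x2 - step * g2
        values.append(V(x1, x2))
    return values


if __name__ == "__main__":
    values = simulate(1.0, -0.5)
    print(values[0], values[-1])
    # Euler's identity for a form of degree d = 4: x^T grad V(x) = 4 V(x).
    x1, x2 = 0.3, -0.7
    g1, g2 = grad_V(x1, x2)
    print(x1 * g1 + x2 * g2, 4 * V(x1, x2))
```

The decay toward the origin is only algebraic in time, which is typical for cubic homogeneous vector fields and is why a fairly long horizon is used.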

Perhaps of independent interest, the reduction we just gave suggests a method 
for proving positive definiteness of forms. Given a form V, we can construct a 
dynamical system as in (4.11), and then any method that we may have for proving 
stability of vector fields (e.g. the use of various kinds of Lyapunov functions) can 
serve as an algorithm for proving positivity of V. In particular, if we use a 
polynomial Lyapunov function W to prove stability of the system in (4.11), we 
get the following corollary. 

Corollary 4.4. Let $V$ and $W$ be two forms of possibly different degree. If $W$ is
positive definite, and $\langle \nabla W, \nabla V \rangle$ is positive definite, then $V$ is positive definite.

One interesting fact about this corollary is that its algebraic version, with
positivity replaced by sum of squares, is not true. In other words, we can have $W$ sos
(and positive definite), $\langle \nabla W, \nabla V \rangle$ sos (and positive definite), but $V$ not sos. This
gives us a way of proving positivity of some polynomials that are not sos, using
only sos certificates. Given a form $V$, since the expression $\langle \nabla W, \nabla V \rangle$ is linear in
the coefficients of $W$, we can use sos programming to search for a form $W$ that
satisfies $W$ sos and $\langle \nabla W, \nabla V \rangle$ sos, and this would prove positivity of $V$. The
following example demonstrates the potential usefulness of this approach.

Example 4.2.1. Consider the following form of degree 6: 

$$V(x) = x_1^4 x_2^2 + x_1^2 x_2^4 - 3 x_1^2 x_2^2 x_3^2 + x_3^6 + \frac{1}{250}(x_1^2 + x_2^2 + x_3^2)^3. \tag{4.13}$$

One can check that this polynomial is not a sum of squares. (In fact, this is the 
Motzkin form presented in equation (3.1) of Chapter 3 slightly perturbed.) On the 
other hand, we can use YALMIP [98] together with the SDP solver SeDuMi [157] 
to search for a form W satisfying 



$$
\begin{aligned}
& W \ \text{sos} \\
& \langle \nabla W, \nabla V \rangle \ \text{sos}.
\end{aligned} \tag{4.14}
$$

If we parameterize $W$ as a quadratic form, no feasible solution will be returned
from the solver. However, when we increase the degree of $W$ from 2 to 4, the
solver returns the following polynomial

$$
\begin{aligned}
W(x) = {} & 9x_2^4 + 9x_1^4 - 6x_1^2x_2^2 + 6x_1^2x_3^2 + 6x_2^2x_3^2 + 3x_3^4 - x_1^3x_2 - x_1x_2^3 \\
& - x_1^3x_3 - 3x_1^2x_2x_3 - 3x_1x_2^2x_3 - x_2^3x_3 - 4x_1x_2x_3^2 - x_1x_3^3 - x_2x_3^3
\end{aligned}
$$

that satisfies both sos constraints in (4.14). The Gram matrices in these sos
decompositions are positive definite. Therefore, $W$ and $\langle \nabla W, \nabla V \rangle$ are positive
definite forms. Hence, by Corollary 4.4, we have a proof that $V$ in (4.13) is
positive definite. △

Interestingly, approaches of this type that use gradient information for proving 
positivity of polynomials with sum of squares techniques have been studied by 
Nie, Demmel, and Sturmfels in [113], though the derivation there is not Lyapunov- 
inspired. 



■ 4.3 Non-existence of polynomial Lyapunov functions

As we mentioned at the beginning of this chapter, the question of global asymp- 
totic stability of polynomial vector fields is commonly addressed by seeking a 
Lyapunov function that is polynomial itself. This approach has become further 
prevalent over the past decade due to the fact that we can use sum of squares 
techniques to algorithmically search for such Lyapunov functions. The question 
therefore naturally arises as to whether existence of polynomial Lyapunov func- 
tions is necessary for global stability of polynomial systems. In this section, we 
give a negative answer to this question by presenting a remarkably simple coun- 
terexample. In view of the fact that globally asymptotically stable linear systems 
always admit quadratic Lyapunov functions, it is quite interesting to observe that 
the following vector field that is arguably "the next simplest system" to consider 
does not admit a polynomial Lyapunov function of any degree. 

Theorem 4.5. Consider the polynomial vector field 

$$
\begin{aligned}
\dot{x} &= -x + xy \\
\dot{y} &= -y.
\end{aligned} \tag{4.15}
$$

The origin is a globally asymptotically stable equilibrium point, but the system 
does not admit a polynomial Lyapunov function. 

Proof. Let us first show that the system is GAS. Consider the Lyapunov function

$$V(x, y) = \ln(1 + x^2) + y^2,$$

which clearly vanishes at the origin, is strictly positive for all $(x, y) \neq (0, 0)$, and
is radially unbounded. The derivative of $V(x, y)$ along the trajectories of (4.15)
is given by

$$
\begin{aligned}
\dot{V}(x, y) &= \frac{\partial V}{\partial x} \dot{x} + \frac{\partial V}{\partial y} \dot{y} \\
&= \frac{2x^2(y - 1)}{1 + x^2} - 2y^2 \\
&= -\frac{x^2 + 2y^2 + x^2y^2 + (x - xy)^2}{1 + x^2},
\end{aligned}
$$

which is obviously strictly negative for all $(x, y) \neq (0, 0)$. In view of Lyapunov's
stability theorem (see e.g. [86, p. 124]), this shows that the origin is globally
asymptotically stable.

Figure 4.1. Typical trajectories of the vector field in (4.15) starting from initial conditions in
the nonnegative orthant.
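
The two expressions for $\dot{V}$ above can be compared numerically. The sketch below evaluates the chain-rule expression and the closed form on a grid of sample points; the function names and the grid are assumptions made for this check, which is a sanity test and not a substitute for the algebra.

```python
import math


def f(x, y):
    """Right hand side of the vector field (4.15)."""
    return (-x + x * y, -y)


def Vdot_chain_rule(x, y):
    """dV/dx * xdot + dV/dy * ydot for V = ln(1 + x^2) + y^2."""
    fx, fy = f(x, y)
    return (2 * x / (1 + x * x)) * fx + 2 * y * fy


def Vdot_closed_form(x, y):
    """The final expression for Vdot given in the proof."""
    return -(x**2 + 2 * y**2 + x**2 * y**2 + (x - x * y)**2) / (1 + x**2)


if __name__ == "__main__":
    worst = 0.0
    for i in range(-10, 11):
        for j in range(-10, 11):
            x, y = i / 2, j / 2
            worst = max(worst, abs(Vdot_chain_rule(x, y) - Vdot_closed_form(x, y)))
    print(worst)
```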

Let us now prove that no positive definite polynomial Lyapunov function (of
any degree) can decrease along the trajectories of system (4.15). The proof will
be based on simply considering the value of a candidate Lyapunov function at
two specific points. We will look at trajectories on the nonnegative orthant, with
initial conditions on the line $(k, ak)$ for some constant $a > 0$, and then observe
the location of the crossing of the trajectory with the horizontal line $y = a$. We
will argue that by taking $k$ large enough, the trajectory will have to travel "too
far east" (see Figure 4.1) and this will make it impossible for any polynomial
Lyapunov function to decrease.

To do this formally, we start by noting that we can explicitly solve for the
solution $(x(t), y(t))$ of the vector field in (4.15) starting from any initial condition
$(x(0), y(0))$:

$$
\begin{aligned}
x(t) &= x(0)\, e^{[y(0) - y(0)e^{-t} - t]} \\
y(t) &= y(0)\, e^{-t}.
\end{aligned} \tag{4.16}
$$

Consider initial conditions

$$(x(0), y(0)) = (k, ak)$$

parameterized by $k > 1$ and for some fixed constant $a > 0$. From the explicit
solution in (4.16) we have that the time $t^*$ it takes for the trajectory to cross the
line $y = a$ is

$$t^* = \ln(k),$$

and that the location of this crossing is given by

$$(x(t^*), y(t^*)) = (e^{a(k-1)}, a).$$

Consider now any candidate nonnegative polynomial function $V(x, y)$ that depends
on both $x$ and $y$ (as any Lyapunov function should). Since $k > 1$ (and
thus, $t^* > 0$), for $V(x, y)$ to be a valid Lyapunov function, it must satisfy
$V(x(t^*), y(t^*)) < V(x(0), y(0))$, i.e.,

$$V(e^{a(k-1)}, a) < V(k, ak).$$

However, this inequality cannot hold for $k$ large enough, since for a generic fixed
$a$, the left hand side grows exponentially in $k$ whereas the right hand side grows
only polynomially in $k$. The only subtlety arises from the fact that $V(e^{a(k-1)}, a)$
could potentially be a constant for some particular choices of $a$. However, for
any polynomial $V(x, y)$ with nontrivial dependence on $y$, this may happen for at
most finitely many values of $a$. Therefore, any generic choice of $a$ would make
the argument work. □
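
The crossing computation is easy to reproduce from the closed-form solution (4.16). In the sketch below, `candidate_V` is a hypothetical polynomial chosen purely for illustration; any other polynomial with nontrivial dependence on both variables would show the same eventual failure of the decrease condition for large $k$.

```python
import math


def flow(x0, y0, t):
    """Closed-form solution (4.16) of the vector field (4.15)."""
    y = y0 * math.exp(-t)
    x = x0 * math.exp(y0 - y - t)
    return x, y


def crossing(k, a):
    """Point where the trajectory from (k, a*k) meets the line y = a."""
    t_star = math.log(k)
    return flow(k, a * k, t_star)


def candidate_V(x, y):
    """Hypothetical polynomial Lyapunov candidate (illustration only)."""
    return x**2 + y**2


if __name__ == "__main__":
    a = 2.0
    for k in (2.0, 5.0, 10.0):
        xc, yc = crossing(k, a)
        # Left: value at the crossing. Right: value at the initial condition.
        print(k, candidate_V(xc, yc), candidate_V(k, a * k))
```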

Example of Bacciotti and Rosier. After our counterexample above was submitted 
for publication, Christian Ebenbauer brought to our attention an earlier coun- 
terexample of Bacciotti and Rosier [22, Prop. 5.2] that achieves the same goal 
(though by using irrational coefficients). We will explain the differences between 
the two examples below. At the time of submission of our result, we were under 
the impression that no such examples were known, partly because of a recent 
reference in the controls literature that ends its conclusion with the following 
statement [126], [127]: 

"Still unresolved is the fundamental question of whether globally stable vector 
fields will also admit sum-of-squares Lyapunov functions." 






CHAPTER 4. LYAPUNOV ANALYSIS OF POLYNOMIAL DIFFERENTIAL EQUATIONS 



In [126], [127], what is referred to as a sum of squares Lyapunov function (in 
contrast to our terminology here) is a Lyapunov function that is a sum of squares, 
with no sos requirements on its derivative. Therefore, the fundamental question 
referred to above is on existence of a polynomial Lyapunov function. If one were 
to exist, then we could simply square it to get another polynomial Lyapunov 
function that is a sum of squares (see Lemma 4.6). 

The example of Bacciotti and Rosier is a vector field in 2 variables and degree 
5 that is GAS but has no polynomial (and no analytic) Lyapunov function even 
around the origin. Their very clever construction is complementary to our exam- 
ple in the sense that what creates trouble for existence of polynomial Lyapunov 
functions in our Theorem 4.5 is growth rates arbitrarily far away from the ori- 
gin, whereas the problem arising in their example is slow decay rates arbitrarily 
close to the origin. The example crucially relies on a parameter that appears as 
part of the coefficients of the vector field being irrational. (Indeed, one easily 
sees that if that parameter is rational, their vector field does admit a polynomial 
Lyapunov function.) In practical applications where computational techniques 
for searching over Lyapunov functions on finite precision machines are used, such 
issues with irrationality of the input cannot occur. By contrast, the example in 
(4.15) is much less contrived and demonstrates that non-existence of polynomial 
Lyapunov functions can happen for extremely simple systems that may very well 
appear in applications. 

In [125], Peet has shown that locally exponentially stable polynomial vector 
fields admit polynomial Lyapunov functions on compact sets. The example of 
Bacciotti and Rosier implies that the assumption of exponential stability indeed 
cannot be dropped. 

■ 4.4 (Non)-existence of sum of squares Lyapunov functions

In this section, we suppose that the polynomial vector field at hand admits a
polynomial Lyapunov function, and we would like to investigate whether such a
Lyapunov function can be found with sos programming. In other words, we would
like to see whether the constraints in (4.5) and (4.6) are more conservative than the
true Lyapunov inequalities in (4.2) and (4.3). We think of the sos Lyapunov conditions
in (4.5) and (4.6) as sufficient conditions for the strict inequalities in (4.2)
and (4.3) even though sos decomposition in general merely guarantees non-strict
inequalities. The reason for this is that when an sos feasibility problem is strictly
feasible, the polynomials returned by interior point algorithms are automatically
positive definite (see [1, p. 41] for more discussion). 6

We shall emphasize that existence of nonnegative polynomials that are not 
sums of squares does not imply on its own that the sos conditions in (4.5) and 
(4.6) are more conservative than the Lyapunov inequalities in (4.2) and (4.3). 
Since Lyapunov functions are not in general unique, it could happen that within 
the set of valid polynomial Lyapunov functions of a given degree, there is always 
at least one that satisfies the sos conditions (4.5) and (4.6). Moreover, many of the 
known examples of nonnegative polynomials that are not sos have multiple zeros 
and local minima [138] and therefore cannot serve as Lyapunov functions. Indeed, 
if a function has a local minimum other than the origin, then its value evaluated 
on a trajectory starting from the local minimum would not be decreasing. 

■ 4.4.1 A motivating example

The following example will help motivate the kind of questions that we are ad- 
dressing in this section. 

Example 4.4.1. Consider the dynamical system 

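$$
\begin{aligned}
\dot{x}_1 = {} & -0.15x_1^7 + 200x_1^6x_2 - 10.5x_1^5x_2^2 - 807x_1^4x_2^3 \\
& + 14x_1^3x_2^4 + 600x_1^2x_2^5 - 3.5x_1x_2^6 + 9x_2^7 \\
\dot{x}_2 = {} & -9x_1^7 - 3.5x_1^6x_2 - 600x_1^5x_2^2 + 14x_1^4x_2^3 \\
& + 807x_1^3x_2^4 - 10.5x_1^2x_2^5 - 200x_1x_2^6 - 0.15x_2^7.
\end{aligned} \tag{4.17}
$$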

A typical trajectory of the system that starts from the initial condition $x_0 =
(2, 2)^T$ is plotted in Figure 4.2. Our goal is to establish global asymptotic stability
of the origin by searching for a polynomial Lyapunov function. Since the vector 
field is homogeneous, the search can be restricted to homogeneous Lyapunov 
functions [72], [146]. To employ the sos technique, we can use the software package 
SOSTOOLS [132] to search for a Lyapunov function satisfying the sos conditions 
(4.5) and (4.6). However, if we do this, we will not find any Lyapunov functions 
of degree 2, 4, or 6. If needed, a certificate from the dual semidefinite program 
can be obtained, which would prove that no polynomial of degree up to 6 can 
satisfy the sos requirements (4.5) and (4.6). 

At this point we are faced with the following question. Does the system
really not admit a Lyapunov function of degree 6 that satisfies the true Lyapunov
inequalities in (4.2), (4.3)? Or is the failure due to the fact that the sos conditions
in (4.5), (4.6) are more conservative?

6 We expect the reader to recall the basic definitions and concepts from Subsection 3.2.1 of
the previous chapter. Throughout, when we say a Lyapunov function (or the negative of its
derivative) is positive definite, we mean that it is positive everywhere except possibly at the
origin.

Figure 4.2. A typical trajectory of the vector field in Example 4.4.1 (solid), level sets of a
degree 8 polynomial Lyapunov function (dotted).

Note that when searching for a degree 6 Lyapunov function, the sos constraint 
in (4.5) is requiring a homogeneous polynomial in 2 variables and of degree 6 to be 
a sum of squares. The sos condition (4.6) on the derivative is also a condition on 
a homogeneous polynomial in 2 variables, but in this case of degree 12. (This is 
easy to see from $\dot{V} = \langle \nabla V, f \rangle$.) Recall from Theorem 3.1 of the previous chapter
that nonnegativity and sum of squares are equivalent notions for homogeneous 
bivariate polynomials, irrespective of the degree. Hence, we now have a proof 
that this dynamical system truly does not have a Lyapunov function of degree 6 
(or lower). 

This fact is perhaps geometrically intuitive. Figure 4.2 shows that the tra- 
jectory of this system is stretching out in 8 different directions. So, we would 
expect the degree of the Lyapunov function to be at least 8. Indeed, when we in- 
crease the degree of the candidate function to 8, SOSTOOLS and the SDP solver 
SeDuMi [157] succeed in finding the following Lyapunov function: 

$$
\begin{aligned}
V(x) = {} & 0.02x_1^8 + 0.015x_1^7x_2 + 1.743x_1^6x_2^2 - 0.106x_1^5x_2^3 \\
& - 3.517x_1^4x_2^4 + 0.106x_1^3x_2^5 + 1.743x_1^2x_2^6 \\
& - 0.015x_1x_2^7 + 0.02x_2^8.
\end{aligned}
$$

The level sets of this Lyapunov function are plotted in Figure 4.2 and are clearly
invariant under the trajectory. △

■ 4.4.2 A counterexample

Unlike the scenario in the previous example, we now show that a failure in find- 
ing a Lyapunov function of a particular degree via sum of squares programming 
can also be due to the gap between nonnegativity and sum of squares. What 
will be conservative in the following counterexample is the sos condition on the 
derivative. 7 

Consider the dynamical system 

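$$
\begin{aligned}
\dot x_1 &= -x_1^3x_2^2 + 2x_1^3x_2 - x_1^3 + 4x_1^2x_2^2 - 8x_1^2x_2 + 4x_1^2 \\
&\quad - x_1x_2^4 + 4x_1x_2^3 - 4x_1 + 10x_2^2 \\
\dot x_2 &= -9x_1^2x_2 + 10x_1^2 + 2x_1x_2^3 - 8x_1x_2^2 - 4x_1 - x_2^3 \\
&\quad + 4x_2^2 - 4x_2.
\end{aligned}
\tag{4.18}
$$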

One can verify that the origin is the only equilibrium point for this system, and 
therefore it makes sense to investigate global asymptotic stability. If we search for 
a quadratic Lyapunov function for (4.18) using sos programming, we will not find 
one. It will turn out that the corresponding semidefinite program is infeasible. 
We will prove shortly why this is the case, i.e., why no quadratic function $V$ can
satisfy

$$
\begin{aligned}
V &\ \text{sos} \\
-\dot V &\ \text{sos}.
\end{aligned}
\tag{4.19}
$$

Nevertheless, we claim that 

$$ V(x) = \tfrac12 x_1^2 + \tfrac12 x_2^2 \tag{4.20} $$
is a valid Lyapunov function. Indeed, one can check that

$$ \dot V(x) = x_1\dot x_1 + x_2\dot x_2 = -M(x_1 - 1, x_2 - 1), \tag{4.21} $$
where $M(x_1, x_2)$ is the Motzkin polynomial [107]:

$$ M(x_1, x_2) = x_1^4x_2^2 + x_1^2x_2^4 - 3x_1^2x_2^2 + 1. $$

This polynomial is just a dehomogenized version of the Motzkin form presented 
before, and it has the property of being nonnegative but not a sum of squares. 
The polynomial $\dot V$ is strictly negative everywhere, except for the origin and three
other points $(0,2)^T$, $(2,0)^T$, and $(2,2)^T$, where $\dot V$ is zero. However, at each of
these three points we have $\dot x \neq 0$. Once the trajectory reaches any of these three
points, it will be kicked out to a region where $\dot V$ is strictly negative. Therefore,
[Figure 4.3, three panels. (a) Shifted Motzkin polynomial is nonnegative but not sos. (b) Typical trajectories of (4.18) (solid), level sets of $V$ (dotted); horizontal axis $x_1$. (c) Level sets of a quartic Lyapunov function found through sos programming; horizontal axis $x_1$.]

Figure 4.3. The quadratic polynomial $\frac12x_1^2 + \frac12x_2^2$ is a valid Lyapunov function for the vector
field in (4.18) but it is not detected through sos programming.

by LaSalle's invariance principle (see e.g. [86, p. 128]), the quadratic Lyapunov 
function in (4.20) proves global asymptotic stability of the origin of (4.18). 
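The identity (4.21) and the claims about the three extra zeros of the derivative can be checked mechanically. The following is a minimal sketch in plain Python with exact integer arithmetic; the function names are ours and are not part of the original development.

```python
def f(x1, x2):
    """Right-hand side of the vector field (4.18)."""
    dx1 = (-x1**3 * x2**2 + 2 * x1**3 * x2 - x1**3
           + 4 * x1**2 * x2**2 - 8 * x1**2 * x2 + 4 * x1**2
           - x1 * x2**4 + 4 * x1 * x2**3 - 4 * x1 + 10 * x2**2)
    dx2 = (-9 * x1**2 * x2 + 10 * x1**2 + 2 * x1 * x2**3
           - 8 * x1 * x2**2 - 4 * x1 - x2**3 + 4 * x2**2 - 4 * x2)
    return dx1, dx2


def motzkin(x1, x2):
    """Motzkin polynomial M(x1, x2)."""
    return x1**4 * x2**2 + x1**2 * x2**4 - 3 * x1**2 * x2**2 + 1


def v_dot(x1, x2):
    """Derivative of V = (x1^2 + x2^2)/2 along (4.18), i.e. <grad V, f>."""
    dx1, dx2 = f(x1, x2)
    return x1 * dx1 + x2 * dx2


# Both sides of (4.21) have degree at most 4 in each variable, so agreement
# on a 7-by-7 integer grid (exact arithmetic) forces them to be equal.
GRID = range(-3, 4)
identity_holds = all(
    v_dot(a, b) == -motzkin(a - 1, b - 1) for a in GRID for b in GRID
)

# The three zeros of the derivative away from the origin, and the vector
# field evaluated there (it must be nonzero for the LaSalle argument).
extra_zeros = [(0, 2), (2, 0), (2, 2)]
field_at_zeros = {p: f(*p) for p in extra_zeros}
```

A grid check suffices here because a bivariate polynomial of degree at most 4 in each variable that vanishes on a 5-by-5 product grid is identically zero.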

The fact that $\dot V$ is zero at three points other than the origin is not the reason
why sos programming is failing. After all, when we impose the condition that $-\dot V$
should be sos, we allow for the possibility of a non-strict inequality. The reason
why our sos program does not recognize (4.20) as a Lyapunov function is that the
shifted Motzkin polynomial in (4.21) is nonnegative but it is not a sum of squares.
This sextic polynomial is plotted in Figure 4.3(a). Trajectories of (4.18) starting
at $(2,2)^T$ and $(-2.5,-3)^T$ along with level sets of $V$ are shown in Figure 4.3(b).

So far, we have shown that V in (4.20) is a valid Lyapunov function but does 
not satisfy the sos conditions in (4.19). We still need to show why no other 



7 This counterexample has appeared in our earlier work [1] but not with a complete proof. 
quadratic Lyapunov function

$$ U(x) = c_1x_1^2 + c_2x_1x_2 + c_3x_2^2 \tag{4.22} $$

can satisfy the sos conditions either.^{8} We will in fact prove the stronger statement
that $V$ in (4.20) is the only valid quadratic Lyapunov function for this system up
to scaling, i.e., any quadratic function $U$ that is not a scalar multiple of $\frac12x_1^2 + \frac12x_2^2$
cannot satisfy $U \ge 0$ and $-\dot U \ge 0$. It will even be the case that no such $U$ can
satisfy $-\dot U \ge 0$ alone. (The latter fact is to be expected since global asymptotic
stability of (4.18) together with $-\dot U \ge 0$ would automatically imply $U \ge 0$;
see [12, Theorem 1.1].)

So, let us show that $-\dot U \ge 0$ implies $U$ is a scalar multiple of $\frac12x_1^2 + \frac12x_2^2$.
Because Lyapunov functions are closed under positive scalings, without loss of
generality we can take $c_1 = 1$. One can check that

$$ -\dot U(0,2) = -80c_2, $$

so to have $-\dot U \ge 0$, we need $c_2 \le 0$. Similarly,

$$ -\dot U(2,2) = -288c_1 + 288c_3, $$

which implies that $c_3 \ge 1$. Let us now look at

$$
\begin{aligned}
-\dot U(x_1,1) ={}& -c_2x_1^3 + 10c_2x_1^2 + 2c_2x_1 - 10c_2 - 2c_3x_1^2 \\
& + 20c_3x_1 + 2c_3 + 2x_1^2 - 20x_1.
\end{aligned}
\tag{4.23}
$$

If we let $x_1 \to -\infty$, the term $-c_2x_1^3$ dominates this polynomial. Since $c_2 \le 0$
and $-\dot U \ge 0$, we conclude that $c_2 = 0$. Once $c_2$ is set to zero in (4.23), the
dominating term for $x_1$ large will be $(2 - 2c_3)x_1^2$. Therefore to have $-\dot U(x_1, 1) \ge 0$
as $x_1 \to \pm\infty$ we must have $c_3 \le 1$. Hence, we conclude that $c_1 = 1$, $c_2 = 0$, $c_3 = 1$,
and this finishes the proof.
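The three evaluations used in this proof are routine but error-prone by hand. A minimal sketch that recomputes them from (4.18) and (4.22) with exact integers is given below; the helper names are ours and purely illustrative.

```python
def f(x1, x2):
    """Right-hand side of the vector field (4.18)."""
    dx1 = (-x1**3 * x2**2 + 2 * x1**3 * x2 - x1**3
           + 4 * x1**2 * x2**2 - 8 * x1**2 * x2 + 4 * x1**2
           - x1 * x2**4 + 4 * x1 * x2**3 - 4 * x1 + 10 * x2**2)
    dx2 = (-9 * x1**2 * x2 + 10 * x1**2 + 2 * x1 * x2**3
           - 8 * x1 * x2**2 - 4 * x1 - x2**3 + 4 * x2**2 - 4 * x2)
    return dx1, dx2


def minus_u_dot(c1, c2, c3, x1, x2):
    """-<grad U, f> for U = c1*x1^2 + c2*x1*x2 + c3*x2^2 as in (4.22)."""
    dx1, dx2 = f(x1, x2)
    return -((2 * c1 * x1 + c2 * x2) * dx1 + (c2 * x1 + 2 * c3 * x2) * dx2)


def rhs_4_23(c2, c3, x1):
    """Right-hand side of (4.23): -U-dot restricted to x2 = 1 with c1 = 1."""
    return (-c2 * x1**3 + 10 * c2 * x1**2 + 2 * c2 * x1 - 10 * c2
            - 2 * c3 * x1**2 + 20 * c3 * x1 + 2 * c3 + 2 * x1**2 - 20 * x1)


COEFFS = range(-2, 3)  # a small sample of integer coefficients

# Evaluations at (0, 2) and (2, 2) for every sampled (c1, c2, c3).
point_checks = all(
    minus_u_dot(c1, c2, c3, 0, 2) == -80 * c2
    and minus_u_dot(c1, c2, c3, 2, 2) == -288 * c1 + 288 * c3
    for c1 in COEFFS for c2 in COEFFS for c3 in COEFFS
)

# Restriction to the line x2 = 1 against the closed form (4.23).
slice_check = all(
    minus_u_dot(1, c2, c3, x1, 1) == rhs_4_23(c2, c3, x1)
    for c2 in COEFFS for c3 in COEFFS for x1 in range(-5, 6)
)
```

Since both sides are linear in the coefficients and of low degree in $x_1$, agreement on this sample already pins down the identities; the sign and limit arguments of the proof are not automated here.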

Even though sos programming failed to prove stability of the system in (4.18) 
with a quadratic Lyapunov function, if we increase the degree of the candidate 
Lyapunov function from 2 to 4, then SOSTOOLS succeeds in finding a quartic 
Lyapunov function 

$$
\begin{aligned}
W(x) ={}& 0.08x_1^4 - 0.04x_1^3 + 0.13x_1^2x_2^2 + 0.03x_1^2x_2 \\
& + 0.13x_1^2 + 0.04x_1x_2^2 - 0.15x_1x_2 \\
& + 0.07x_2^4 - 0.01x_2^3 + 0.12x_2^2,
\end{aligned}
$$

8 Since we can assume that the Lyapunov function U and its gradient vanish at the origin, 
linear or constant terms are not needed in (4.22). 
which satisfies the sos conditions in (4.19). The level sets of this function are
close to circles and are plotted in Figure 4.3(c). 

Motivated by this example, it is natural to ask whether it is always true that 
upon increasing the degree of the Lyapunov function one will find Lyapunov func- 
tions that satisfy the sum of squares conditions in (4.19). In the next subsection, 
we will prove that this is indeed the case, at least for planar systems such as the 
one in this example, and also for systems that are homogeneous. 

■ 4.4.3 Converse sos Lyapunov theorems

In [126], [127], it is shown that if a system admits a polynomial Lyapunov function, 
then it also admits one that is a sum of squares. However, the results there do 
not lead to any conclusions as to whether the negative of the derivative of the 
Lyapunov function is sos, i.e., whether condition (4.6) is satisfied. As we remarked
before, there is therefore no guarantee that the semidefinite program can find such 
a Lyapunov function. Indeed, our counterexample in the previous subsection 
demonstrated this very phenomenon. 

The proof technique used in [126], [127] is based on approximating the solution 
map using the Picard iteration and is interesting in itself, though the actual 
conclusion that a Lyapunov function that is sos exists has a far simpler proof 
which we give in the next lemma. 

Lemma 4.6. If a polynomial dynamical system has a positive definite polynomial
Lyapunov function $V$ with a negative definite derivative $\dot V$, then it also admits a
positive definite polynomial Lyapunov function $W$ which is a sum of squares.

Proof. Take $W = V^2$. The negative of the derivative $-\dot W = -2V\dot V$ is clearly
positive definite (though it may not be sos). □

We will next prove a converse sos Lyapunov theorem that guarantees the 
derivative of the Lyapunov function will also satisfy the sos condition, though 
this result is restricted to homogeneous systems. The proof of this theorem relies 
on the following Positivstellensatz result due to Scheiderer. 

Theorem 4.7 (Scheiderer, [151]). Given any two positive definite homogeneous
polynomials $p$ and $q$, there exists an integer $k$ such that $pq^k$ is a sum of squares.

Theorem 4.8. Given a homogeneous polynomial vector field, suppose there exists
a homogeneous polynomial Lyapunov function $V$ such that $V$ and $-\dot V$ are positive
definite. Then, there also exists a homogeneous polynomial Lyapunov function $W$
such that $W$ is sos and $-\dot W$ is sos.
Proof. Observe that $V^2$ and $-2V\dot V$ are both positive definite and homogeneous
polynomials. Applying Theorem 4.7 to these two polynomials, we conclude the
existence of an integer $k$ such that $(-2V\dot V)(V^2)^k$ is sos. Let

$$ W = V^{2k+2}. $$

Then, $W$ is clearly sos since it is a perfect even power. Moreover,

$$ -\dot W = -(2k+2)V^{2k+1}\dot V = -(k+1)\,2V^{2k}V\dot V $$

is also sos by the previous claim.^{9} □

Next, we develop a similar theorem that removes the homogeneity assumption 
from the vector field, but instead is restricted to vector fields on the plane. For 
this, we need another result of Scheiderer. 

Theorem 4.9 (Scheiderer, [150, Cor. 3.12]). Let $p := p(x_1, x_2, x_3)$ and
$q := q(x_1, x_2, x_3)$ be two homogeneous polynomials in three variables, with $p$ positive
semidefinite and $q$ positive definite. Then, there exists an integer $k$ such that $pq^k$
is a sum of squares.

Theorem 4.10. Given a (not necessarily homogeneous) polynomial vector field
in two variables, suppose there exists a positive definite polynomial Lyapunov
function $V$, with $-\dot V$ positive definite, and such that the highest order term of $V$
has no zeros.^{10} Then, there also exists a polynomial Lyapunov function $W$ such
that $W$ is sos and $-\dot W$ is sos.

Proof. Let $\tilde V = V + 1$. So, $\dot{\tilde V} = \dot V$. Consider the (non-homogeneous) polynomials
$\tilde V^2$ and $-2\tilde V\dot V$ in the variables $x := (x_1, x_2)$. Let us denote the (even) degrees of
$\tilde V$ and $\dot V$ respectively by $d_1$ and $d_2$. Note that $\tilde V^2$ is nowhere zero and
$-2\tilde V\dot V$ is only zero at the origin. Our first step is to homogenize these polynomials
by introducing a new variable $y$. Observing that the homogenization of products
of polynomials equals the product of homogenizations, we obtain the following
two trivariate forms:

$$ y^{2d_1}\,\tilde V^2(\tfrac{x}{y}), \tag{4.24} $$

$$ -2\,y^{d_1}y^{d_2}\,\tilde V(\tfrac{x}{y})\,\dot V(\tfrac{x}{y}). \tag{4.25} $$

9 Note that $W$ constructed in this proof proves GAS since $-\dot W$ is positive definite and $W$
itself being homogeneous and positive definite is automatically radially unbounded.

10 This requirement is only slightly stronger than the requirement of radial unboundedness, 
which is imposed on V by Lyapunov's theorem anyway. 
Since by assumption the highest order term of $V$ has no zeros, the form in (4.24)
is positive definite. The form in (4.25), however, is only positive semidefinite.
In particular, since $\dot{\tilde V} = \dot V$ has to vanish at the origin, the form in (4.25) has a
zero at the point $(x_1, x_2, y) = (0, 0, 1)$. Nevertheless, since Theorem 4.9 allows for
positive semidefiniteness of one of the two forms, by applying it to the forms in
(4.24) and (4.25), we conclude that there exists an integer $k$ such that

$$ -2\,y^{d_1(2k+1)}y^{d_2}\,\tilde V(\tfrac{x}{y})\,\dot V(\tfrac{x}{y})\,\tilde V^{2k}(\tfrac{x}{y}) \tag{4.26} $$

is sos. Let $W = \tilde V^{2k+2}$. Then, $W$ is clearly sos. Moreover,

$$ -\dot W = -(2k+2)\tilde V^{2k+1}\dot V = -(k+1)\,2\tilde V^{2k}\tilde V\dot V $$
is also sos because this polynomial is obtained from (4.26) by setting $y = 1$.^{11} □

■ 4.5 Existence of sos Lyapunov functions for switched linear systems

The result of Theorem 4.8 extends in a straightforward manner to Lyapunov 
analysis of switched systems. In particular, we are interested in the highly-studied 
problem of stability analysis of arbitrary switched linear systems: 

$$ \dot x = A_ix, \quad i \in \{1, \dots, m\}, \tag{4.27} $$

$A_i \in \mathbb{R}^{n \times n}$. We assume the minimum dwell time of the system is bounded away
from zero. This guarantees that the solutions of (4.27) are well-defined. Existence
of a common Lyapunov function is necessary and sufficient for (global) asymptotic
stability under arbitrary switching (ASUAS) of system (4.27). The ASUAS
of system (4.27) is equivalent to asymptotic stability of the linear differential
inclusion

$$ \dot x \in \mathrm{co}\{A_i\}\,x, \quad i \in \{1, \dots, m\}, $$

where co here denotes the convex hull. It is also known that ASUAS of (4.27) 
is equivalent to exponential stability under arbitrary switching [15]. A common 
approach for analyzing the stability of these systems is to use the sos technique 
to search for a common polynomial Lyapunov function [131], [42]. We will prove 
the following result. 

11 Once again, we note that the function $W$ constructed in this proof is radially unbounded,
achieves its global minimum at the origin, and has $-\dot W$ positive definite. Therefore, $W$ proves
global asymptotic stability.
Theorem 4.11. The switched linear system in (4.27) is asymptotically stable
under arbitrary switching if and only if there exists a common homogeneous polynomial
Lyapunov function $W$ such that

$$
\begin{aligned}
W &\ \text{sos} \\
-\dot W_i = -\langle \nabla W(x), A_ix \rangle &\ \text{sos},
\end{aligned}
$$

for $i = 1, \dots, m$, where the polynomials $W$ and $-\dot W_i$ are all positive definite.

To prove this result, we will use the following theorem of Mason et al. 

Theorem 4.12 (Mason et al., [103]). If the switched linear system in (4.27)
is asymptotically stable under arbitrary switching, then there exists a common
homogeneous polynomial Lyapunov function $V$ such that

$$
\begin{aligned}
V(x) &> 0 \quad \forall x \neq 0 \\
-\dot V_i(x) = -\langle \nabla V(x), A_ix \rangle &> 0 \quad \forall x \neq 0,
\end{aligned}
$$

for $i = 1, \dots, m$.

The next proposition is an extension of Theorem 4.8 to switched systems (not 
necessarily linear). 

Proposition 4.13. Consider an arbitrary switched dynamical system

$$ \dot x = f_i(x), \quad i \in \{1, \dots, m\}, $$

where $f_i(x)$ is a homogeneous polynomial vector field of degree $d_i$ (the degrees of
the different vector fields can be different). Suppose there exists a common positive
definite homogeneous polynomial Lyapunov function $V$ such that

$$ -\dot V_i(x) = -\langle \nabla V(x), f_i(x) \rangle $$

is positive definite for all $i \in \{1, \dots, m\}$. Then there exists a common homogeneous
polynomial Lyapunov function $W$ such that $W$ is sos and the polynomials

$$ -\dot W_i = -\langle \nabla W(x), f_i(x) \rangle, $$

for all $i \in \{1, \dots, m\}$, are also sos.

Proof. Observe that for each $i$, the polynomials $V^2$ and $-2V\dot V_i$ are both positive
definite and homogeneous. Applying Theorem 4.7 $m$ times to these pairs of
polynomials, we conclude the existence of positive integers $k_i$ such that

$$ (-2V\dot V_i)(V^2)^{k_i} \ \text{is sos}, \tag{4.28} $$
for $i = 1, \dots, m$. Let

$$ k = \max\{k_1, \dots, k_m\}, $$

and let

$$ W = V^{2k+2}. $$

Then, $W$ is clearly sos. Moreover, for each $i$, the polynomial

$$
\begin{aligned}
-\dot W_i &= -(2k+2)V^{2k+1}\dot V_i \\
&= -(k+1)\,2V\dot V_i\,V^{2k_i}\,V^{2(k-k_i)}
\end{aligned}
$$

is sos since $(-2V\dot V_i)(V^{2k_i})$ is sos by (4.28), $V^{2(k-k_i)}$ is sos as an even power, and
products of sos polynomials are sos. □

The proof of Theorem 4.11 now simply follows from Theorem 4.12 and Proposition
4.13 in the special case where $d_i = 1$ for all $i$.

Analysis of switched linear systems is also of great interest to us in discrete 
time. In fact, the subject of the next chapter will be on the study of systems of 
the type 

$$ x_{k+1} = A_ix_k, \quad i \in \{1, \dots, m\}, \tag{4.29} $$

where at each time step the update rule can be given by any of the $m$ matrices
$A_i$. The analogue of Theorem 4.11 for these systems has already been proven by
Parrilo and Jadbabaie in [122]. It is shown that if (4.29) is asymptotically stable
under arbitrary switching, then there exists a homogeneous polynomial Lyapunov
function $W$ such that

$$
\begin{aligned}
W(x) &\ \text{sos} \\
W(x) - W(A_ix) &\ \text{sos},
\end{aligned}
$$

for $i = 1, \dots, m$. We will end this section by proving two related propositions of a
slightly different flavor. It will be shown that for switched linear systems, both in 
discrete time and in continuous time, the sos condition on the Lyapunov function 
itself is never conservative, in the sense that if one of the "decrease inequali- 
ties" is sos, then the Lyapunov function is automatically sos. These propositions 
are really statements about linear systems, so we will present them that way. 
However, since stable linear systems always admit quadratic Lyapunov functions, 
the propositions are only interesting in the context where a common polynomial 
Lyapunov function for a switched linear system is sought.

Proposition 4.14. Consider the linear dynamical system $x_{k+1} = Ax_k$ in discrete
time. Suppose there exists a positive definite polynomial Lyapunov function $V$
such that $V(x) - V(Ax)$ is positive definite and sos. Then, $V$ is sos.
Proof. Consider the polynomial $V(x) - V(Ax)$ that is sos by assumption. If we
replace $x$ by $Ax$ in this polynomial, we conclude that the polynomial
$V(Ax) - V(A^2x)$ is also sos. Hence, by adding these two sos polynomials, we get that
$V(x) - V(A^2x)$ is sos. This procedure can obviously be repeated to infer that for
any integer $k \ge 1$, the polynomial

$$ V(x) - V(A^kx) \tag{4.30} $$

is sos. Since by assumption $V$ and $V(x) - V(Ax)$ are positive definite, the linear
system must be GAS, and hence $A^k$ converges to the zero matrix as $k \to \infty$.
Observe that for all $k$, the polynomials in (4.30) have degree equal to the degree
of $V$, and that the coefficients of $V(x) - V(A^kx)$ converge to the coefficients of $V$
as $k \to \infty$. Since for a fixed degree and dimension the cone of sos polynomials is
closed [141], it follows that $V$ is sos. □
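The telescoping step and the decay of $V(A^kx)$ can be illustrated numerically. The sketch below uses exact rational arithmetic on a hypothetical stable matrix and a hypothetical quartic form chosen by us; it checks only the identity behind (4.30) and the convergence, not any sos property.

```python
from fractions import Fraction as Fr

# Hypothetical stable matrix: its eigenvalues have modulus sqrt(0.22) < 1.
A = [[Fr(1, 2), Fr(1, 5)], [Fr(-1, 10), Fr(2, 5)]]


def apply(M, x):
    """Matrix-vector product for a 2x2 matrix and a 2-vector."""
    return (M[0][0] * x[0] + M[0][1] * x[1],
            M[1][0] * x[0] + M[1][1] * x[1])


def V(x):
    """Hypothetical positive definite quartic form."""
    return x[0]**4 + x[1]**4


def orbit(x, k):
    """Return [x, A x, A^2 x, ..., A^k x]."""
    points = [x]
    for _ in range(k):
        points.append(apply(A, points[-1]))
    return points


def telescoped_decrease(x, k):
    """Sum of the one-step decreases V(A^j x) - V(A^(j+1) x), j = 0..k-1."""
    points = orbit(x, k)
    return sum(V(points[j]) - V(points[j + 1]) for j in range(k))


x0 = (Fr(3), Fr(-2))
lhs = telescoped_decrease(x0, 10)        # sum of k one-step terms
rhs = V(x0) - V(orbit(x0, 10)[-1])       # V(x) - V(A^k x), as in (4.30)
tail = float(V(orbit(x0, 50)[-1]))       # V(A^50 x), expected to be tiny
```

In the proof each summand is the sos polynomial $V(y) - V(Ay)$ evaluated at $y = A^jx$, which is why the sum in (4.30) stays sos for every $k$.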

Similarly, in continuous time, we have the following proposition. 

Proposition 4.15. Consider the linear dynamical system $\dot x = Ax$ in continuous
time. Suppose there exists a positive definite polynomial Lyapunov function $V$
such that $-\dot V = -\langle \nabla V(x), Ax \rangle$ is positive definite and sos. Then, $V$ is sos.

Proof. The value of the polynomial $V$ along the trajectories of the dynamical
system satisfies the relation

$$ V(x(t)) = V(x(0)) + \int_0^t \dot V(x(\tau))\,d\tau. $$

Since the assumptions imply that the system is GAS, $V(x(t)) \to 0$ as $t$ goes to
infinity. (Here, we are assuming, without loss of generality, that $V$ vanishes at
the origin.) By evaluating the above equation at $t = \infty$, rearranging terms, and
substituting $e^{A\tau}x$ for the solution of the linear system at time $\tau$ starting at initial
condition $x$, we obtain

$$ V(x) = \int_0^\infty -\dot V(e^{A\tau}x)\,d\tau. $$

By assumption, $-\dot V$ is sos and therefore for any value of $\tau$, the integrand $-\dot V(e^{A\tau}x)$
is an sos polynomial. Since converging integrals of sos polynomials are sos, it
follows that $V$ is sos. □
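The integral representation of $V$ used in this proof can be sanity-checked numerically. The sketch below assumes a hypothetical diagonal matrix $A = \mathrm{diag}(-1, -2)$ and a hypothetical quartic form $V$, both chosen by us so that $e^{A\tau}x$ has a closed form; the integral is approximated by a midpoint rule on a truncated horizon.

```python
import math

A_DIAG = (-1.0, -2.0)  # hypothetical stable matrix A = diag(-1, -2)


def V(x1, x2):
    """Hypothetical positive definite quartic form."""
    return x1**4 + x1**2 * x2**2 + x2**4


def minus_v_dot(x1, x2):
    """-<grad V(x), A x> for the diagonal A above."""
    g1 = 4 * x1**3 + 2 * x1 * x2**2
    g2 = 2 * x1**2 * x2 + 4 * x2**3
    return -(g1 * A_DIAG[0] * x1 + g2 * A_DIAG[1] * x2)


def flow(x1, x2, tau):
    """Solution e^(A tau) x of the linear system at time tau."""
    return math.exp(A_DIAG[0] * tau) * x1, math.exp(A_DIAG[1] * tau) * x2


def integral_of_minus_v_dot(x1, x2, horizon=20.0, steps=20000):
    """Midpoint-rule approximation of the integral over [0, horizon]."""
    h = horizon / steps
    total = 0.0
    for j in range(steps):
        tau = (j + 0.5) * h
        total += minus_v_dot(*flow(x1, x2, tau))
    return total * h


approx = integral_of_minus_v_dot(1.5, -0.7)
exact = V(1.5, -0.7)
```

The truncation to a finite horizon is harmless here because the integrand decays exponentially; for a general stable $A$ one would need a matrix exponential routine, which is outside this sketch.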

Remark 4.5.1. The previous proposition does not hold if the system is not linear.
For example, consider any positive form $V$ that is not a sum of squares and define
a dynamical system by $\dot x = -\nabla V(x)$. In this case, both $V$ and $-\dot V = \|\nabla V(x)\|^2$
are positive definite and $-\dot V$ is sos, though $V$ is not sos.
■ 4.6 Some open questions

Some open questions related to the problems studied in this chapter are the fol- 
lowing. Regarding complexity, of course the interesting problem is to formally 
answer the questions of Arnold on undecidability of determining stability for 
polynomial vector fields. Regarding existence of polynomial Lyapunov functions, 
Mark Tobenkin asked whether a globally exponentially stable polynomial vector 
field admits a polynomial Lyapunov function. Our counterexample in Section 4.3, 
though GAS and locally exponentially stable, is not globally exponentially stable 
because of exponential growth rates in the large. The counterexample of Bac- 
ciotti and Rosier in [22] is not even locally exponentially stable. Another future 
direction is to prove that GAS homogeneous polynomial vector fields admit ho- 
mogeneous polynomial Lyapunov functions. This, together with Theorem 4.8, 
would imply that asymptotic stability of homogeneous polynomial systems can 
always be decided via sum of squares programming. Also, it is not clear to us 
whether the assumption of homogeneity and planarity can be removed from The- 
orems 4.8 and 4.10 on existence of sos Lyapunov functions. Finally, another 
research direction would be to obtain upper bounds on the degree of polynomial 
or sos polynomial Lyapunov functions. Some degree bounds are known for Lya- 
punov analysis of locally exponentially stable systems [127], but they depend on 
uncomputable properties of the solution such as convergence rate. Degree bounds 
on Positivstellensatz results of the type in Theorems 4.7 and 4.9 are known, but
are typically exponential in size and not very encouraging for practical purposes.



Chapter 5

Joint Spectral Radius and Path-Complete Graph Lyapunov Functions



In this chapter, we introduce the framework of path-complete graph Lyapunov 
functions for analysis of switched systems. The methodology is presented in the 
context of approximation of the joint spectral radius. The content of this chapter 
is based on an extended version of the work in [3]. 

■ 5.1 Introduction

Given a finite set of square matrices $\mathcal{A} := \{A_1, \dots, A_m\}$, their joint spectral radius
$\rho(\mathcal{A})$ is defined as

$$ \rho(\mathcal{A}) = \lim_{k \to \infty} \max_{\sigma \in \{1, \dots, m\}^k} \|A_{\sigma_k} \cdots A_{\sigma_2}A_{\sigma_1}\|^{1/k}, \tag{5.1} $$

where the quantity $\rho(\mathcal{A})$ is independent of the norm used in (5.1). The joint
spectral radius (JSR) is a natural generalization of the spectral radius of a single
square matrix and it characterizes the maximal growth rate that can be obtained
by taking products, of arbitrary length, of all possible permutations of $A_1, \dots, A_m$.
This concept was introduced by Rota and Strang [147] in the early 60s and has 
since been the subject of extensive research within the engineering and the math- 
ematics communities alike. Aside from a wealth of fascinating mathematical 
questions that arise from the JSR, the notion emerges in many areas of applica- 
tion such as stability of switched linear dynamical systems, computation of the 
capacity of codes, continuity of wavelet functions, convergence of consensus algo- 
rithms, trackability of graphs, and many others. See [85] and references therein 
for a recent survey of the theory and applications of the JSR. 

Motivated by the abundance of applications, there has been much work on 
efficient computation of the joint spectral radius; see e.g. [32], [31], [122], and 
references therein. Unfortunately, the negative results in the literature certainly
restrict the horizon of possibilities. In [35], Blondel and Tsitsiklis prove that even
when the set $\mathcal{A}$ consists of only two matrices, the question of testing whether
$\rho(\mathcal{A}) \le 1$ is undecidable. They also show that unless P=NP, one cannot compute
an approximation $\hat\rho$ of $\rho$ that satisfies $|\hat\rho - \rho| \le \epsilon\rho$, in a number of steps polynomial
in the bit size of $\mathcal{A}$ and the bit size of $\epsilon$ [161]. It is not difficult to show that the
spectral radius of any finite product of length $k$ raised to the power of $1/k$ gives a
lower bound on $\rho$ [85]. However, for reasons that we explain next, our focus will
be on computing upper bounds for $\rho$.
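The two elementary bounds can be computed by brute force for short products. The sketch below, for 2-by-2 matrices only, returns the lower bound from spectral radii of all products of length $k$ and an upper bound from their norms; the upper bound relies on the standard fact (not stated in the text above) that for a submultiplicative norm the maximum in (5.1) at any fixed $k$ already bounds $\rho$ from above. The function names and the example pair of matrices are ours.

```python
from itertools import product
import math


def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][t] * Y[t][j] for t in range(2)) for j in range(2)]
            for i in range(2)]


def spectral_radius(M):
    """Spectral radius of a real 2x2 matrix from its trace and determinant."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    disc = tr * tr - 4 * det
    if disc >= 0:
        root = math.sqrt(disc)
        return max(abs(tr + root), abs(tr - root)) / 2
    return math.sqrt(det)  # complex conjugate pair: |lambda|^2 = det


def inf_norm(M):
    """Induced infinity norm (maximum absolute row sum)."""
    return max(abs(M[0][0]) + abs(M[0][1]), abs(M[1][0]) + abs(M[1][1]))


def jsr_bounds(matrices, k):
    """Lower and upper bounds on the JSR from all products of length k."""
    lower, upper = 0.0, 0.0
    for sigma in product(range(len(matrices)), repeat=k):
        P = [[1.0, 0.0], [0.0, 1.0]]
        for s in sigma:              # sigma_1 is applied first
            P = matmul(matrices[s], P)
        lower = max(lower, spectral_radius(P) ** (1.0 / k))
        upper = max(upper, inf_norm(P) ** (1.0 / k))
    return lower, upper


# Illustrative pair (our choice, not taken from the text).
A1 = [[1.0, 1.0], [0.0, 1.0]]
A2 = [[1.0, 0.0], [1.0, 1.0]]
lower4, upper4 = jsr_bounds([A1, A2], 4)
```

For this pair the product $A_1A_2$ has spectral radius $(3+\sqrt5)/2$, so the lower bound at even lengths equals the golden ratio. The number of products grows as $m^k$, which is one reason the chapter turns to Lyapunov-based upper bounds instead.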

There is an attractive connection between the joint spectral radius and the 
stability properties of an arbitrary switched linear system; i.e., dynamical systems 
of the form 

$$ x_{k+1} = A_{\sigma(k)}x_k, \tag{5.2} $$

where $\sigma : \mathbb{Z} \to \{1, \dots, m\}$ is a map from the set of integers to the set of indices. It
is well-known that $\rho < 1$ if and only if system (5.2) is absolutely asymptotically
stable (AAS), that is, (globally) asymptotically stable for all switching sequences.
Moreover, it is known [95] that absolute asymptotic stability of (5.2) is equivalent
to absolute asymptotic stability of the linear difference inclusion

$$ x_{k+1} \in \mathrm{co}\mathcal{A}\; x_k, \tag{5.3} $$

where $\mathrm{co}\mathcal{A}$ here denotes the convex hull of the set $\mathcal{A}$. Therefore, any method for
obtaining upper bounds on the joint spectral radius provides sufficient conditions
for stability of systems of type (5.2) or (5.3). Conversely, if we can prove absolute
asymptotic stability of (5.2) or (5.3) for the set $\mathcal{A}_\gamma := \{\gamma A_1, \dots, \gamma A_m\}$ for some
positive scalar $\gamma$, then we get an upper bound of $1/\gamma$ on $\rho(\mathcal{A})$. (This follows from the
scaling property of the JSR: $\rho(\mathcal{A}_\gamma) = \gamma\rho(\mathcal{A})$.) One advantage of working with the
notion of the joint spectral radius is that it gives a way of rigorously quantifying 
the performance guarantee of different techniques for stability analysis of systems 
(5.2) or (5.3). 

Perhaps the most well-established technique for proving stability of switched 
systems is the use of a common (or simultaneous) Lyapunov function. The idea 
here is that if there is a continuous, positive, and homogeneous (Lyapunov) function
$V(x) : \mathbb{R}^n \to \mathbb{R}$ that for some $\gamma > 1$ satisfies

$$ V(\gamma A_ix) \le V(x) \quad \forall i = 1, \dots, m, \ \forall x \in \mathbb{R}^n, \tag{5.4} $$

(i.e., V(x) decreases no matter which matrix is applied), then the system in (5.2) 
(or in (5.3)) is AAS. Conversely, it is known that if the system is AAS, then 
there exists a convex common Lyapunov function (in fact a norm); see e.g. [85, 
p. 24]. However, this function is not in general finitely constructable. A popular
approach has been to try to approximate this function by a class of functions that 
we can efficiently search for using convex optimization and in particular semidef- 
inite programming. As we mentioned in our introductory chapters, semidefinite 
programs (SDPs) can be solved with arbitrary accuracy in polynomial time and 
lead to efficient computational methods for approximation of the JSR. As an example,
if we take the Lyapunov function to be quadratic (i.e., $V(x) = x^TPx$),
then the search for such a Lyapunov function can be formulated as the following
SDP:

$$
\begin{aligned}
P &\succ 0 \\
\gamma^2A_i^TPA_i &\preceq P \quad \forall i = 1, \dots, m.
\end{aligned}
\tag{5.5}
$$
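To make the role of $\gamma$ in (5.5) concrete, the following sketch checks the two matrix inequalities for a fixed candidate $P$ and bisects on $\gamma$. It is not an SDP solver: a real implementation would also search over $P$ with a semidefinite programming package, whereas here $P$ is supplied by hand, the matrices are 2-by-2, and all function names are ours. The reciprocal of the returned $\gamma$ is therefore only the upper bound on the JSR certified by that particular $P$.

```python
def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][t] * Y[t][j] for t in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(M):
    return [[M[0][0], M[1][0]], [M[0][1], M[1][1]]]


def is_pd(S):
    """Positive definiteness of a symmetric 2x2 matrix via leading minors."""
    return S[0][0] > 0 and S[0][0] * S[1][1] - S[0][1] * S[1][0] > 0


def is_psd(S, tol=1e-12):
    """Positive semidefiniteness of a symmetric 2x2 matrix (up to tol)."""
    return (S[0][0] >= -tol and S[1][1] >= -tol
            and S[0][0] * S[1][1] - S[0][1] * S[1][0] >= -tol)


def feasible(P, matrices, gamma):
    """Check P > 0 and gamma^2 A_i^T P A_i <= P for all i, as in (5.5)."""
    if not is_pd(P):
        return False
    for A in matrices:
        APA = matmul(transpose(A), matmul(P, A))
        S = [[P[r][c] - gamma**2 * APA[r][c] for c in range(2)]
             for r in range(2)]
        if not is_psd(S):
            return False
    return True


def largest_gamma(P, matrices, hi=1e3, iters=80):
    """Bisection on gamma for a fixed P; assumes gamma = hi is infeasible."""
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if feasible(P, matrices, mid):
            lo = mid
        else:
            hi = mid
    return lo


# Illustrative data (our choice): identity P and two shear matrices.
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
A1 = [[1.0, 1.0], [0.0, 1.0]]
A2 = [[1.0, 0.0], [1.0, 1.0]]
bound_from_identity = 1.0 / largest_gamma(IDENTITY, [A1, A2])
```

Bisection is valid because, for fixed $P$, feasibility of (5.5) at some $\gamma$ implies feasibility at every smaller positive $\gamma$. With $P = I$ the bound reduces to the largest singular value among the matrices.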

The quality of approximation of common quadratic Lyapunov functions is a
well-studied topic. In particular, it is known [32] that the estimate $\hat\rho_{V^2}$ obtained
by this method^{1} satisfies

$$ \frac{1}{\sqrt{n}}\,\hat\rho_{V^2}(\mathcal{A}) \le \rho(\mathcal{A}) \le \hat\rho_{V^2}(\mathcal{A}), \tag{5.6} $$

where $n$ is the dimension of the matrices. This bound is a direct consequence of
John's ellipsoid theorem and is known to be tight [13].

In [122], the use of sum of squares (sos) polynomial Lyapunov functions of 
degree 2d was proposed as a common Lyapunov function for the switched system 
in (5.2). As we know, the search for such a Lyapunov function can again be 
formulated as a semidefinite program. This method does considerably better 
than a common quadratic Lyapunov function in practice and its estimate $\hat\rho_{V^{SOS,2d}}$
satisfies the bound

$$ \frac{1}{\sqrt[2d]{\eta}}\,\hat\rho_{V^{SOS,2d}}(\mathcal{A}) \le \rho(\mathcal{A}) \le \hat\rho_{V^{SOS,2d}}(\mathcal{A}), \tag{5.7} $$

where $\eta = \min\{m, \binom{n+d-1}{d}\}$. Furthermore, as the degree $2d$ goes to infinity,
the estimate $\hat\rho_{V^{SOS,2d}}$ converges to the true value of $\rho$ [122]. The semidefinite
programming based methods for approximation of the JSR have been recently
generalized and put in the framework of conic programming [134].

■ 5.1.1 Contributions and organization of this chapter

It is natural to ask whether one can develop better approximation schemes for the 
joint spectral radius by using multiple Lyapunov functions as opposed to requiring 



1 The estimate $\hat\rho_{V^2}$ is the reciprocal of the largest $\gamma$ that satisfies (5.5) and can be found by
bisection.
simultaneous contractibility of a single Lyapunov function with respect to all the
matrices. More concretely, our goal is to understand how we can write inequalities
among, say, $k$ different Lyapunov functions $V_1(x), \dots, V_k(x)$ that imply absolute
asymptotic stability of (5.2) and can be checked via semidefinite programming.

The general idea of using several Lyapunov functions for analysis of switched 
systems is a very natural one and has already appeared in the literature (although 
to our knowledge not in the context of the approximation of the JSR); see e.g. 
[136], [39], [81], [80], [64]. Perhaps one of the earliest references is the work on 
"piecewise quadratic Lyapunov functions" in [136]. However, this work is in the 
different framework of state dependent switching, where the dynamics switches 
depending on which region of the space the trajectory is traversing (as opposed 
to arbitrary switching). In this setting, there is a natural way of using several 
Lyapunov functions: assign one Lyapunov function per region and "glue them together".
Closer to our setting, there is a body of work in the literature that gives
sufficient conditions for existence of piecewise Lyapunov functions of the type
$\max\{x^TP_1x, \dots, x^TP_kx\}$, $\min\{x^TP_1x, \dots, x^TP_kx\}$, and $\mathrm{conv}\{x^TP_1x, \dots, x^TP_kx\}$,
i.e., the pointwise maximum, the pointwise minimum, and the convex envelope
of a set of quadratic functions [81], [80], [64], [82]. These works are mostly concerned
with analysis of linear differential inclusions in continuous time, but they
have obvious discrete time counterparts. The main drawback of these methods
is that in their greatest generality, they involve solving bilinear matrix inequalities,
which are non-convex and in general NP-hard. One therefore has to turn
to heuristics, which have no performance guarantees and whose computation time
quickly becomes prohibitive when the dimension of the system increases. Moreover,
all of these methods solely provide sufficient conditions for stability with no
performance guarantees.

There are several unanswered questions that in our view deserve a more thor- 
ough study: (i) With a focus on conditions that are amenable to convex optimiza- 
tion, what are the different ways to write a set of inequalities among k Lyapunov 
functions that imply absolute asymptotic stability of (5.2)? Can we give a uni- 
fying framework that includes the previously proposed Lyapunov functions and 
perhaps also introduces new ones? (ii) Among the different sets of inequalities 
that imply stability, can we identify some that are less conservative than
others? (iii) The available methods on piecewise Lyapunov functions solely provide
sufficient conditions for stability with no guarantee on their performance. Can 
we give converse theorems that guarantee the existence of a feasible solution to 
our search for a given accuracy? 

The contributions of this chapter to these questions are as follows. We propose 
a unifying framework based on a representation of Lyapunov inequalities with la- 
beled graphs and by making some connections with basic concepts in automata 
theory. This is done in Section 5.2, where we define the notion of a path-complete
graph (Definition 5.2) and prove that any such graph provides an approximation 
scheme for the JSR (Theorem 5.4). In Section 5.3, we give examples of families of 
path-complete graphs and show that many of the previously proposed techniques 
come from particular classes of simple path-complete graphs (e.g., Corollary 5.8, 
Corollary 5.9, and Remark 5.3.2). In Section 5.4, we characterize all the path- 
complete graphs with two nodes for the analysis of the JSR of two matrices. We 
determine how the approximations obtained from all of these graphs compare 
(Proposition 5.12). In Section 5.5, we study in more depth the approximation 
properties of a particular pair of "dual" path-complete graphs that seem to per- 
form very well in practice. Subsection 5.5.1 contains more general results about 
duality within path-complete graphs and its connection to transposition of ma- 
trices (Theorem 5.13). Subsection 5.5.2 gives an approximation guarantee for 
the graphs studied in Section 5.5 (Theorem 5.16), and Subsection 5.5.3 contains 
some numerical examples. In Section 5.6, we prove a converse theorem for the 
method of max-of-quadratics Lyapunov functions (Theorem 5.17) and an approx- 
imation guarantee for a new class of methods for proving stability of switched 
systems (Theorem 5.18). Finally, some concluding remarks and future directions 
are presented in Section 5.7. 

■ 5.2 Path-complete graphs and the joint spectral radius 

In what follows, we will think of the set of matrices $\mathcal{A} := \{A_1, \dots, A_m\}$ as a finite
alphabet and we will often refer to a finite product of matrices from this set as
a word. We denote the set of all words $A_{i_t} \cdots A_{i_1}$ of length $t$ by $\mathcal{A}^t$. Contrary
to the standard convention in automata theory, our convention is to read a word
from right to left. This is in accordance with the order of matrix multiplication.
The set of all finite words is denoted by $\mathcal{A}^*$; i.e., $\mathcal{A}^* = \bigcup_{t \in \mathbb{Z}^+} \mathcal{A}^t$.

The basic idea behind our framework is to represent through a graph all the 
possible occurrences of products that can appear in a run of the dynamical system 
in (5.2), and assert via some Lyapunov inequalities that no matter what occur- 
rence appears, the product must remain stable. A convenient way of representing 
these Lyapunov inequalities is via a directed labeled graph G(N, E). Each node of 
this graph is associated with a (continuous, positive definite, and homogeneous) 
Lyapunov function $V_i(x) : \mathbb{R}^n \to \mathbb{R}$, and each edge is labeled by a finite product
of matrices, i.e., by a word from the set $\mathcal{A}^*$. As illustrated in Figure 5.1, given
two nodes with Lyapunov functions $V_i(x)$ and $V_j(x)$ and an edge going from node
$i$ to node $j$ labeled with the matrix $A_l$, we write the Lyapunov inequality:

$$ V_j(A_lx) \le V_i(x) \quad \forall x \in \mathbb{R}^n. \tag{5.8} $$
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Figure 5.1. Graphical representation of Lyapunov inequalities. The edge in the graph above 
corresponds to the Lyapunov inequality VAAix) < Vi(x). Here, A\ can be a single matrix from 
A or a finite product of matrices from A. 

The problem that we are interested in is to understand which sets of Lyapunov 
inequalities imply stability of the switched system in (5.2). We will answer this 
question based on the corresponding graph. 

For reasons that will become clear shortly, we would like to reduce graphs
whose edges have arbitrary labels from the set $\mathcal{A}^*$ to graphs whose edges have
labels from the set $\mathcal{A}$, i.e., labels of length one. This is explained next.

Definition 5.1. Given a labeled directed graph $G(N,E)$, we define its expanded
graph $G^e(N^e, E^e)$ as the outcome of the following procedure. For every edge
$(i,j) \in E$ with label $A_{i_k} \cdots A_{i_1} \in \mathcal{A}^k$, where $k > 1$, we remove the edge $(i,j)$
and replace it with $k$ new edges $(s_q, s_{q+1}) \in E^e \setminus E$ : $q \in \{0, \ldots, k-1\}$,
where $s_0 = i$ and $s_k = j$.$^2$ (These new edges go from node $i$ through $k-1$
newly added nodes $s_1, \ldots, s_{k-1}$ and then to node $j$.) We then label the new edges
$(i, s_1), \ldots, (s_q, s_{q+1}), \ldots, (s_{k-1}, j)$ with $A_{i_1}, \ldots, A_{i_k}$ respectively.




[Figure 5.2: left panel, graph $G(N,E)$; right panel, expanded graph $G^e(N^e,E^e)$.]

Figure 5.2. Graph expansion: edges with labels of length more than one are broken into new
edges with labels of length one.
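The expansion procedure of Definition 5.1 can be sketched in a few lines. The representation below is an assumption of this example: an edge is a triple `(i, j, label)`, and `label` is a tuple of letters listed in the order in which the matrices are applied, i.e., `(i_1, ..., i_k)` for the word $A_{i_k} \cdots A_{i_1}$. The function name `expand` and the naming of auxiliary nodes are hypothetical.

```python
def expand(nodes, edges):
    """Return (nodes_e, edges_e), the expanded graph of Definition 5.1.

    Every edge whose label has length k > 1 is replaced by a chain of k
    unit-label edges through k - 1 fresh auxiliary nodes.
    """
    nodes_e = list(nodes)
    edges_e = []
    for e, (i, j, label) in enumerate(edges):
        k = len(label)
        # chain = [s_0, s_1, ..., s_k] with s_0 = i and s_k = j
        chain = [i] + [("s", e, q) for q in range(1, k)] + [j]
        nodes_e.extend(chain[1:-1])
        for q in range(k):
            edges_e.append((chain[q], chain[q + 1], (label[q],)))
    return nodes_e, edges_e


# Hypothetical example: one node "v" with a self-loop labeled A_2 A_1
# (application order (1, 2)) and a self-loop labeled A_1.
nodes_e, edges_e = expand(["v"], [("v", "v", (1, 2)), ("v", "v", (1,))])
```

Indexing the auxiliary nodes by the edge they came from keeps them distinct even when two original edges share the same endpoints.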

An example of a graph and its expansion is given in Figure 5.2. Note that if 
a graph has only labels of length one, then its expanded graph equals itself. The 
next definition is central to our development. 

Definition 5.2. Given a directed graph $G(N,E)$ whose edges are labeled with
words from the set $\mathcal{A}^*$, we say that the graph is path-complete, if for all finite
words $A_{\sigma_k} \cdots A_{\sigma_1}$ of any length $k$ (i.e., for all words in $\mathcal{A}^*$), there is a directed
path in its expanded graph $G^e(N^e, E^e)$ such that the labels on the edges of this
path are the labels $A_{\sigma_1}$ up to $A_{\sigma_k}$.

2 It is understood that the node index $s_q$ depends on the original nodes $i$ and $j$. To keep the
notation simple we write $s_q$ instead of $s_q^{ij}$.

In Figure 5.3, we present seven path-complete graphs on the alphabet $\mathcal{A} =
\{A_1, A_2\}$. The fact that these graphs are path-complete is easy to see for graphs
$H_1$, $H_2$, $G_3$, and $G_4$, but perhaps not so obvious for graphs $H_3$, $G_1$, and $G_2$. One
way to check if a graph is path-complete is to think of it as a finite automaton by
introducing an auxiliary start node (state) with free transitions to every node and
by making all the other nodes be accepting states. Then, there are well-known
algorithms (see e.g. [78, Chap. 4]) that check whether the language accepted by
an automaton is $\mathcal{A}^*$, which is equivalent to the graph being path-complete. At
least for the cases where the automata are deterministic (i.e., when all outgoing
edges from any node have different labels), these algorithms are very efficient
and have running time of only $O(|N|^2)$. Similar algorithms exist in the symbolic
dynamics literature; see e.g. [96, Chap. 3]. Our interest in path-complete graphs
stems from Theorem 5.4 below, which establishes that any such graph gives a
method for approximation of the JSR. We introduce one last definition before we
state this theorem.

Figure 5.3. Examples of path-complete graphs for the alphabet $\{A_1, A_2\}$. If Lyapunov
functions satisfying the inequalities associated with any of these graphs are found, then we get an
upper bound of unity on $\rho(A_1, A_2)$.
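The automaton viewpoint described above can be made concrete with a plain subset construction. This is a minimal sketch, not the efficient algorithm of the cited references: it can take exponential time in the number of nodes, and the function names and edge representation (triples `(i, j, label)` with `label` a tuple of letters in application order) are assumptions of this example. Starting from the set of all nodes of the expanded graph, it follows every letter and reports failure as soon as some word leads to the empty set.

```python
def unit_edges(edges):
    """Expand every edge into unit-label transitions (u, letter, v)."""
    out = []
    for e, (i, j, label) in enumerate(edges):
        chain = [i] + [("aux", e, q) for q in range(1, len(label))] + [j]
        for q, letter in enumerate(label):
            out.append((chain[q], letter, chain[q + 1]))
    return out


def is_path_complete(edges, alphabet):
    """True if every finite word over `alphabet` labels a path in the expanded graph."""
    trans = unit_edges(edges)
    # A path may start at any node of the expanded graph, auxiliary ones included.
    start = frozenset(u for u, _, _ in trans) | frozenset(v for _, _, v in trans)
    seen, stack = {start}, [start]
    while stack:
        current = stack.pop()
        for a in alphabet:
            nxt = frozenset(v for (u, b, v) in trans if b == a and u in current)
            if not nxt:
                return False  # some word reaches no node at all
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return True


# One node with loops labeled A_1, A_2 A_1 and A_2^2 (the labels of graph H_3).
h3 = [("v", "v", (1,)), ("v", "v", (1, 2)), ("v", "v", (2, 2))]
# The same graph without the A_2^2 loop: the word A_2 A_2 has no path.
h3_missing = [("v", "v", (1,)), ("v", "v", (1, 2))]
h3_ok = is_path_complete(h3, [1, 2])
h3_missing_ok = is_path_complete(h3_missing, [1, 2])
```

Because every node is treated as both a possible start and an accepting state, the only way a word can fail is by emptying the set of reachable nodes, which is exactly what the loop tests.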



Definition 5.3. Let $\mathcal{A} = \{A_1, \ldots, A_m\}$ be a set of matrices. Given a path-complete
graph $G(N, E)$ and $|N|$ functions $V_i(x)$, we say that $\{V_i(x) \mid i = 1, \ldots, |N|\}$
is a graph Lyapunov function (GLF) associated with $G(N, E)$ if

$$V_j(L((i,j))\,x) \leq V_i(x) \quad \forall x \in \mathbb{R}^n, \ \forall (i,j) \in E,$$

where $L((i,j)) \in \mathcal{A}^*$ is the label associated with edge $(i,j) \in E$ going from node
$i$ to node $j$.
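For quadratic functions $V_i(x) = x^T P_i x$, the inequality of Definition 5.3 on an edge $(i,j)$ with label matrix $L$ is equivalent to $P_i - L^T P_j L \succeq 0$. The sketch below checks this condition for given candidate matrices. It only verifies a candidate and does not search for one (that would require a semidefinite programming solver). The function names, the tolerance, and the sample matrices are assumptions of this example.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def is_psd(M, tol=1e-9):
    """Test whether M + tol*I is positive definite via a Cholesky factorization."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                s += tol
                if s <= 0.0:
                    return False
                L[i][i] = s ** 0.5
            else:
                L[i][j] = s / L[j][j]
    return True


def is_quadratic_glf(P, edges):
    """P maps node -> symmetric matrix; edges are (i, j, A) with A the label matrix.

    Checks P[i] - A^T P[j] A >= 0 on every edge. Positivity of the P[i]
    themselves is a separate requirement and is not checked here.
    """
    for (i, j, A) in edges:
        AtPA = matmul(transpose(A), matmul(P[j], A))
        n = len(A)
        gap = [[P[i][r][c] - AtPA[r][c] for c in range(n)] for r in range(n)]
        if not is_psd(gap):
            return False
    return True


# Hypothetical data: a single node carrying V(x) = x^T x and two loop labels.
identity = [[1.0, 0.0], [0.0, 1.0]]
A1 = [[0.5, 0.0], [0.0, 0.2]]
A2 = [[0.0, 0.5], [0.3, 0.0]]
contracting = is_quadratic_glf({1: identity}, [(1, 1, A1), (1, 1, A2)])
scaled_A1 = [[3 * a for a in row] for row in A1]
expanding = is_quadratic_glf({1: identity}, [(1, 1, scaled_A1), (1, 1, A2)])
```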

Theorem 5.4. Consider a finite set of matrices $\mathcal{A} = \{A_1, \ldots, A_m\}$. For a scalar
$\gamma > 0$, let $\mathcal{A}_\gamma := \{\gamma A_1, \ldots, \gamma A_m\}$. Let $G(N,E)$ be a path-complete graph whose
edges are labeled with words from $\mathcal{A}_\gamma^*$. If there exist positive, continuous, and
homogeneous$^3$ functions $V_i(x)$, one per node of the graph, such that $\{V_i(x) \mid i =
1, \ldots, |N|\}$ is a graph Lyapunov function associated with $G(N, E)$, then $\rho(\mathcal{A}) \leq \frac{1}{\gamma}$.

Proof. We will first prove the claim for the special case where the edge labels of
$G(N, E)$ belong to $\mathcal{A}_\gamma$ (i.e., have length one) and therefore $G(N, E) = G^e(N^e, E^e)$. The general case
will be reduced to this case afterwards. Let $d$ be the degree of homogeneity of the
Lyapunov functions $V_i(x)$, i.e., $V_i(\lambda x) = \lambda^d V_i(x)$ for all $\lambda \in \mathbb{R}$. (The actual value
of $d$ is irrelevant.) By positivity, continuity, and homogeneity of $V_i(x)$, there exist
scalars $\alpha_i$ and $\beta_i$ with $0 < \alpha_i \leq \beta_i$ for $i = 1, \ldots, |N|$, such that

$$\alpha_i \|x\|^d \leq V_i(x) \leq \beta_i \|x\|^d, \tag{5.9}$$

for all $x \in \mathbb{R}^n$ and for all $i = 1, \ldots, |N|$, where $\|x\|$ here denotes the Euclidean
norm of $x$. Let

$$\xi = \max_{i,j \in \{1,\ldots,|N|\}^2} \frac{\beta_i}{\alpha_j}. \tag{5.10}$$

Now consider an arbitrary product $A_{\sigma_k} \cdots A_{\sigma_1}$ of length $k$. Because the graph is
path-complete, there will be a directed path corresponding to this product that
consists of $k$ edges, and goes from some node $i$ to some node $j$. If we write the
chain of $k$ Lyapunov inequalities associated with these edges (cf. Figure 5.1), then
we get

$$V_j(\gamma^k A_{\sigma_k} \cdots A_{\sigma_1} x) \leq V_i(x),$$

which by homogeneity of the Lyapunov functions can be rearranged to

$$\left( \frac{V_j(A_{\sigma_k} \cdots A_{\sigma_1} x)}{V_i(x)} \right)^{\frac{1}{d}} \leq \frac{1}{\gamma^k}. \tag{5.11}$$

We can now bound the spectral norm of $A_{\sigma_k} \cdots A_{\sigma_1}$ as follows:

$$\begin{aligned}
\|A_{\sigma_k} \cdots A_{\sigma_1}\| &\leq \max_x \frac{\|A_{\sigma_k} \cdots A_{\sigma_1} x\|}{\|x\|} \\
&\leq \left( \frac{\beta_i}{\alpha_j} \right)^{\frac{1}{d}} \max_x \frac{V_j^{\frac{1}{d}}(A_{\sigma_k} \cdots A_{\sigma_1} x)}{V_i^{\frac{1}{d}}(x)} \\
&\leq \left( \frac{\beta_i}{\alpha_j} \right)^{\frac{1}{d}} \frac{1}{\gamma^k} \\
&\leq \xi^{\frac{1}{d}} \frac{1}{\gamma^k},
\end{aligned}$$

where the last three inequalities follow from (5.9), (5.11), and (5.10) respectively.
From the definition of the JSR in (5.1), after taking the $k$-th root and the limit
$k \to \infty$, we get that $\rho(\mathcal{A}) \leq \frac{1}{\gamma}$ and the claim is established.

3 The requirement of homogeneity can be replaced by radial unboundedness which is implied
by homogeneity and positivity. However, since the dynamical system in (5.2) is homogeneous,
there is no conservatism in asking $V_i(x)$ to be homogeneous.

Now consider the case where at least one edge of $G(N, E)$ has a label of length
more than one and hence $G^e(N^e, E^e) \neq G(N, E)$. We will start with the Lyapunov
functions $V_i(x)$ assigned to the nodes of $G(N, E)$ and from them we will explicitly
construct $|N^e|$ Lyapunov functions for the nodes of $G^e(N^e, E^e)$ that satisfy the
Lyapunov inequalities associated to the edges in $E^e$. Once this is done, in view
of our preceding argument and the fact that the edges of $G^e(N^e, E^e)$ have labels
of length one by definition, the proof will be completed.

For $j \in N^e$, let us denote the new Lyapunov functions by $V_j^e(x)$. We give the
construction for the case where $|N^e| = |N| + 1$. The result for the general case
follows by iterating this simple construction. Let $s \in N^e \setminus N$ be the added node in
the expanded graph, and $q, r \in N$ be such that $(s, q) \in E^e$ and $(r, s) \in E^e$ with
$A_{sq}$ and $A_{rs}$ as the corresponding labels respectively. Define

$$V_j^e(x) = \begin{cases} V_j(x), & \text{if } j \in N \\ V_q(A_{sq} x), & \text{if } j = s. \end{cases} \tag{5.12}$$

By construction, $r$ and $q$, and subsequently, $A_{sq}$ and $A_{rs}$ are uniquely defined and
hence, $\{V_j^e(x) \mid j \in N^e\}$ is well defined. We only need to show that

$$V_q(A_{sq} x) \leq V_s^e(x) \tag{5.13}$$
$$V_s^e(A_{rs} x) \leq V_r(x). \tag{5.14}$$

Inequality (5.13) follows trivially from (5.12). Furthermore, it follows from (5.12)
that

$$V_s^e(A_{rs} x) = V_q(A_{sq} A_{rs} x) \leq V_r(x),$$

where the inequality follows from the fact that for $i \in N$, the functions $V_i(x)$
satisfy the Lyapunov inequalities of the edges of $G(N, E)$. □
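As a concrete instance of the construction in (5.12), consider a single node (call it node 1) with Lyapunov function $V$ and a self-loop labeled $A_2 A_1$, as occurs in graph $H_3$ discussed below. The following lines are an illustration added here, with the scaling $\gamma$ absorbed into the matrices.

```latex
% Expansion adds one node s, with edge (1,s) labeled A_1 and edge (s,1) labeled A_2.
% Hence r = q = 1, A_{rs} = A_1 and A_{sq} = A_2.
\begin{align*}
V_s^e(x) &= V(A_2 x) && \text{definition (5.12)},\\
V(A_2 x) &\le V_s^e(x) && \text{(5.13), which holds with equality},\\
V_s^e(A_1 x) &= V(A_2 A_1 x) \le V(x) && \text{(5.14), the original inequality for } A_2 A_1.
\end{align*}
```

The added function simply records the value of $V$ one step ahead, so the two unit-length inequalities together carry exactly the information of the original length-two inequality.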

Remark 5.2.1. If the matrix $A_{sq}$ is not invertible, the extended function $V_j^e(x)$ as
defined in (5.12) will only be positive semidefinite. However, since our goal is to
approximate the JSR, we will never be concerned with invertibility of the matrices
in $\mathcal{A}$. Indeed, since the JSR is continuous in the entries of the matrices [85], we can
always perturb the matrices slightly to make them invertible without changing
the JSR by much. In particular, for any $\alpha > 0$, there exist $0 < \varepsilon, \delta < \alpha$ such that

$$\tilde{A}_{sq} = \frac{A_{sq} + \delta I}{1 + \varepsilon}$$

is invertible and (5.12)–(5.14) are satisfied with $A_{sq} = \tilde{A}_{sq}$.

To understand the generality of the framework of "path-complete graph Lyapunov
functions" more clearly, let us revisit the path-complete graphs in Figure 5.3
for the study of the case where the set $\mathcal{A} = \{A_1, A_2\}$ consists of only two
matrices. For all of these graphs, if our choices for the Lyapunov functions $V(x)$ or
$V_1(x)$ and $V_2(x)$ are quadratic functions or sum of squares polynomial functions,
then we can formulate the well-established semidefinite programs that search for
these candidate Lyapunov functions.

Graph $H_1$, which is clearly the simplest possible one, corresponds to the well-known
common Lyapunov function approach. Graph $H_2$ is a common Lyapunov
function applied to all products of length two. This graph also obviously implies
stability.$^4$ But graph $H_3$ tells us that if we find a Lyapunov function that decreases
whenever $A_1$, $A_2^2$, and $A_2 A_1$ are applied (but with no requirement when $A_1 A_2$ is
applied), then we still get stability. This is a priori not obvious and we believe
this approach has not appeared in the literature before. Graph $H_3$ is also an
example that explains why we needed the expansion process. Note that for the
unexpanded graph, there is no path for any word of the form $(A_1 A_2)^k$ or of the
form $A_2^{2k-1}$, for any $k \in \mathbb{N}$. However, one can check that in the expanded graph
of graph $H_3$, there is a path for every finite word, and this in turn allows us to
conclude stability from the Lyapunov inequalities of graph $H_3$.

The remaining graphs in Figure 5.3, which all have two nodes and four edges
with labels of length one, have a connection to the method of min-of-quadratics
or max-of-quadratics Lyapunov functions [81], [80], [64], [82]. If Lyapunov
inequalities associated with any of these four graphs are satisfied, then either
$\min\{V_1(x), V_2(x)\}$ or $\max\{V_1(x), V_2(x)\}$ or both serve as a common Lyapunov
function for the switched system. In the next section, we assert these facts in a
more general setting (Corollaries 5.8 and 5.9) and show that these graphs in some
sense belong to "simplest" families of path-complete graphs.

4 By slight abuse of terminology, we say that a graph implies stability meaning that the
associated Lyapunov inequalities imply stability.

■ 5.3 Duality and examples of families of path-complete graphs

Now that we have shown that any path-complete graph introduces a method for 
proving stability of switched systems, our next focus is naturally on showing how 
one can produce graphs that are path-complete. Before we proceed to some basic 
constructions of such graphs, let us define a notion of duality among graphs which 
essentially doubles the number of path-complete graphs that we can generate. 

Definition 5.5. Given a directed graph $G(N, E)$ whose edges are labeled from
the words in $\mathcal{A}^*$, we define its dual graph $G'(N, E')$ to be the graph obtained by
reversing the direction of the edges of $G$, and changing the label $A_{\sigma_k} \cdots A_{\sigma_1}$ of
every edge of $G$ to its reversed version $A_{\sigma_1} \cdots A_{\sigma_k}$.











[Figure 5.4: two panels showing a graph and its dual.]

Figure 5.4. An example of a pair of dual graphs.

An example of a pair of dual graphs with labels of length one is given in 
Figure 5.4. The following theorem relates dual graphs and path-completeness. 

Theorem 5.6. If a graph $G(N, E)$ is path-complete, then its dual graph $G'(N, E')$
is also path-complete.

Proof. Consider an arbitrary finite word $A_{i_k} \cdots A_{i_1}$. By definition of what it
means for a graph to be path-complete, our task is to show that there exists a
path corresponding to this word in the expanded graph of the dual graph $G'$. It is
easy to see that the expanded graph of the dual graph of $G$ is the same as the dual
graph of the expanded graph of $G$; i.e., ${G'}^e(N^e, {E'}^e) = {G^e}'(N^e, {E^e}')$. Therefore, we
show a path for $A_{i_k} \cdots A_{i_1}$ in ${G^e}'$. Consider the reversed word $A_{i_1} \cdots A_{i_k}$. Since $G$
is path-complete, there is a path corresponding to this reversed word in $G^e$. Now
if we just trace this path backwards, we get exactly a path for the original word
$A_{i_k} \cdots A_{i_1}$ in ${G^e}'$. This completes the proof. □

The next proposition offers a very simple construction for obtaining a large
family of path-complete graphs with labels of length one.

Proposition 5.7. A graph having either of the two properties below is path-complete.
Property (i): every node has outgoing edges with all the labels in $\mathcal{A}$.
Property (ii): every node has incoming edges with all the labels in $\mathcal{A}$.

Proof. If a graph has Property (i), then it is obviously path-complete. If a graph 
has Property (ii), then its dual has Property (i) and therefore by Theorem 5.6 it 
is path-complete. □ 
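The duality argument in this proof is easy to mirror in code. The sketch below builds the dual graph of Definition 5.5 and tests Property (ii) by testing Property (i) on the dual, exactly as the proof does. The edge representation (triples `(i, j, label)` with `label` a tuple of letters) and all function names are assumptions of this example, and the sample graph is hypothetical rather than one of the graphs of Figure 5.3.

```python
def dual(edges):
    """Reverse every edge and reverse its label (Definition 5.5)."""
    return [(j, i, tuple(reversed(label))) for (i, j, label) in edges]


def has_property_i(nodes, edges, alphabet):
    """Every node has outgoing unit-label edges carrying every letter."""
    for node in nodes:
        out_letters = {label[0] for (i, _, label) in edges
                       if i == node and len(label) == 1}
        if not set(alphabet) <= out_letters:
            return False
    return True


def has_property_ii(nodes, edges, alphabet):
    """Every node has incoming edges with every letter: Property (i) of the dual."""
    return has_property_i(nodes, dual(edges), alphabet)


# Hypothetical two-node graph: both edges leaving node 1 carry letter 1,
# both edges leaving node 2 carry letter 2.
nodes = [1, 2]
edges = [(1, 1, (1,)), (2, 1, (2,)), (1, 2, (1,)), (2, 2, (2,))]
prop_i = has_property_i(nodes, edges, [1, 2])
prop_ii = has_property_ii(nodes, edges, [1, 2])
```

In this example only Property (ii) holds, so path-completeness follows from the second half of the proposition and not from the first.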

Examples of path-complete graphs that fall in the category of this proposition
include graphs $G_1$, $G_2$, $G_3$, and $G_4$ in Figure 5.3 and all of their dual graphs. By
combining the previous proposition with Theorem 5.4, we obtain the following two
simple corollaries which unify several linear matrix inequalities (LMIs) that have
been previously proposed in the literature. These corollaries also provide a link
to min/max-of-quadratics Lyapunov functions. Different special cases of these
LMIs have appeared in [81], [80], [64], [82], [93], [53]. Note that the framework of
path-complete graph Lyapunov functions makes the proof of the fact that these
LMIs imply stability immediate.

Corollary 5.8. Consider a set of $m$ matrices and the switched linear system in
(5.2) or (5.3). If there exist $k$ positive definite matrices $P_j$ such that

$$\forall (i,k) \in \{1, \ldots, m\}^2, \ \exists j \in \{1, \ldots, m\} \ \text{such that} \ \gamma^2 A_i^T P_j A_i \preceq P_k, \tag{5.15}$$

for some $\gamma > 1$, then the system is absolutely asymptotically stable. Moreover,
the pointwise minimum

$$\min\{x^T P_1 x, \ldots, x^T P_k x\}$$

of the quadratic functions serves as a common Lyapunov function.

Proof. The inequalities in (5.15) imply that every node of the associated graph
has outgoing edges labeled with all the different $m$ matrices. Therefore, by
Proposition 5.7 the graph is path-complete, and by Theorem 5.4 this implies absolute
asymptotic stability. The proof that the pointwise minimum of the quadratics is
a common Lyapunov function is easy and left to the reader. □

Corollary 5.9. Consider a set of $m$ matrices and the switched linear system in
(5.2) or (5.3). If there exist $k$ positive definite matrices $P_j$ such that

$$\forall (i,j) \in \{1, \ldots, m\}^2, \ \exists k \in \{1, \ldots, m\} \ \text{such that} \ \gamma^2 A_i^T P_j A_i \preceq P_k, \tag{5.16}$$

for some $\gamma > 1$, then the system is absolutely asymptotically stable. Moreover,
the pointwise maximum

$$\max\{x^T P_1 x, \ldots, x^T P_k x\}$$

of the quadratic functions serves as a common Lyapunov function.

Proof. The inequalities in (5.16) imply that every node of the associated graph has 
incoming edges labeled with all the different m matrices. Therefore, by Proposi- 
tion 5.7 the graph is path-complete and the proof of absolute asymptotic stability 
then follows. The proof that the pointwise maximum of the quadratics is a com- 
mon Lyapunov function is again left to the reader. □ 
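One way to fill in the two arguments left to the reader is sketched below. It is an added illustration that uses only inequalities of the form $\gamma^2 A_i^T P_j A_i \preceq P_k$ from (5.15) and (5.16); the symbols $W$, $U$, $k^*$, and $j^*$ are introduced here for convenience.

```latex
% Corollary 5.8. Let W(x) := \min_k x^T P_k x. Fix x and a matrix A_i, and let k^*
% attain the minimum at x. Inequality (5.15) for the pair (i, k^*) supplies a j with
\[
W(A_i x) \;\le\; (A_i x)^T P_j (A_i x) \;\le\; \gamma^{-2}\, x^T P_{k^*} x \;=\; \gamma^{-2}\, W(x).
\]
% Corollary 5.9. Let U(x) := \max_j x^T P_j x. Fix x and A_i, and let j^* attain the
% maximum at the point A_i x. Inequality (5.16) for the pair (i, j^*) supplies a k with
\[
U(A_i x) \;=\; (A_i x)^T P_{j^*} (A_i x) \;\le\; \gamma^{-2}\, x^T P_k x \;\le\; \gamma^{-2}\, U(x).
\]
```

Since $\gamma > 1$, both functions decrease strictly along every nonzero trajectory, whichever matrix is applied.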

Remark 5.3.1. The linear matrix inequalities in (5.15) and (5.16) are (convex) 
sufficient conditions for existence of min-of-quadratics or max-of-quadratics Lya- 
punov functions. The converse is not true. The works in [81], [80], [64], [82] 
have additional multipliers in (5.15) and (5.16) that make the inequalities non- 
convex but when solved with a heuristic method contain a larger family of min- 
of-quadratics and max-of-quadratics Lyapunov functions. Even if the non-convex 
inequalities with multipliers could be solved exactly, except for special cases where 
the S-procedure is exact (e.g., the case of two quadratic functions), these methods
still do not completely characterize min-of-quadratics and max-of-quadratics
functions. 

Remark 5.3.2. The work in [93] on "path-dependent quadratic Lyapunov
functions" and the work in [53] on "parameter dependent Lyapunov functions", when
specialized to the analysis of arbitrary switched linear systems, are special cases
of Corollaries 5.8 and 5.9 respectively. This observation makes a connection
between these techniques and min/max-of-quadratics Lyapunov functions which is
not established in [93], [53]. It is also interesting to note that the path-complete
graph corresponding to the LMIs proposed in [93] (see Theorem 9 there) is the
well-known De Bruijn graph [67].

The set of path-complete graphs is much broader than the simple family
of graphs constructed in Proposition 5.7. Indeed, there are many graphs that are
path-complete without having outgoing (or incoming) edges with all the labels on
every node; see e.g. graph $H_4^e$ in Figure 5.5. This in turn means that there are
several more sophisticated Lyapunov inequalities that we can explore for proving
stability of switched systems. Below, we give one particular example of such
"non-obvious" inequalities for the case of switching between two matrices.

Proposition 5.10. Consider the set $\mathcal{A} = \{A_1, A_2\}$ and the switched linear system
in (5.2) or (5.3). If there exists a positive definite matrix $P$ such that

$$\begin{aligned}
\gamma^2 A_1^T P A_1 &\preceq P, \\
\gamma^4 (A_2 A_1)^T P (A_2 A_1) &\preceq P, \\
\gamma^6 (A_2^2 A_1)^T P (A_2^2 A_1) &\preceq P, \\
\gamma^6 (A_2^3)^T P A_2^3 &\preceq P,
\end{aligned}$$

for some $\gamma > 1$, then the system is absolutely asymptotically stable.

Proof. The graph $H_4$ associated with the LMIs above and its expanded version
$H_4^e$ are drawn in Figure 5.5. We leave it as an exercise for the reader to show (e.g.
by induction on the length of the word) that there is a path for every finite word
in $H_4^e$. Therefore, $H_4$ is path-complete and in view of Theorem 5.4 the claim is
established. □
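A direct way to see the path-completeness claimed in this proof, offered here as a sketch in place of the suggested induction, is to decompose an arbitrary word into the four labels $A_1$, $A_2 A_1$, $A_2^2 A_1$, and $A_2^3$. The shorthand below (letters 1 and 2 written in the order in which the matrices are applied) is an assumption of this illustration.

```latex
% In application order the four labels read 1, 12, 122 and 222. The graph H_4 has a
% single node, so every concatenation of labels is a path in H_4^e, and so is every
% contiguous piece of such a concatenation.
% An arbitrary finite word has the form
\[
w \;=\; 2^{j_0}\, 1\, 2^{j_1}\, 1\, 2^{j_2} \cdots 1\, 2^{j_p}, \qquad j_0, \ldots, j_p \ge 0 .
\]
% Writing j = 3q + r with r \in \{0, 1, 2\}, each block factors into labels as
\[
1\, 2^{j} \;=\; \bigl(1\, 2^{r}\bigr)\,(222)^{q},
\]
% and the leading block 2^{j_0} is a suffix of (222)^{\lceil j_0/3 \rceil}.
% Hence w is a suffix of a concatenation of labels, i.e., it labels a path in H_4^e.
```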

Remark 5.3.3. Proposition 5.10 can be generalized as follows: If a single Lyapunov
function decreases with respect to the matrix products

$$\{A_1, A_2 A_1, A_2^2 A_1, \ldots, A_2^{k-1} A_1, A_2^k\}$$

for some integer $k \geq 1$, then the arbitrary switched system consisting of the two
matrices $A_1$ and $A_2$ is absolutely asymptotically stable. We omit the proof of
this generalization due to space limitations. We will later prove (Theorem 5.18)
a bound for the quality of approximation of path-complete graphs of this type,
where a common Lyapunov function is required to decrease with respect to
products of different lengths.

When we have so many different ways of imposing conditions for stability, it is
natural to ask which ones are better. The answer clearly depends on the
combinatorial structure of the graphs and does not seem to be easy in general.
Nevertheless, in the next section, we compare the performance of all path-complete graphs
with two nodes for analysis of switched systems with two matrices. The
connections between the bounds obtained from these graphs are not always obvious. For
example, we will see that the graphs $H_1$, $G_3$, and $G_4$ always give the same bound
on the joint spectral radius; i.e., one graph will succeed in proving stability if and
only if the other will. So, there is no point in increasing the number of decision
variables and the number of constraints by imposing $G_3$ or $G_4$ in place of $H_1$. The
same is true for the graphs $H_3$ and $G_2$, which makes graph $H_3$ preferable to
graph $G_2$. (See Proposition 5.12.)

■ 5.4 Path-complete graphs with two nodes

In this section, we characterize the set of all path-complete graphs consisting of
two nodes, an alphabet set $\mathcal{A} = \{A_1, A_2\}$, and edge labels of unit length. We will
elaborate on the set of all admissible topologies arising in this setup and compare
the performance (in the sense of conservatism of the ensuing analysis) of different
path-complete graph topologies.

■ 5.4.1 The set of path-complete graphs 

The next lemma establishes that for thorough analysis of the case of two matrices 
and two nodes, we only need to examine graphs with four or fewer edges. 

Lemma 5.11. Let $G(\{1, 2\}, E)$ be a path-complete graph with labels of length one
for $\mathcal{A} = \{A_1, A_2\}$. Let $\{V_1, V_2\}$ be a graph Lyapunov function for $G$. If $|E| > 4$,
then, either

(i) there exists $e \in E$ such that $G(\{1, 2\}, E \setminus e)$ is a path-complete graph,
or

(ii) either $V_1$ or $V_2$ or both are common Lyapunov functions for $\mathcal{A}$.

Proof. If $|E| > 4$, then at least one node has three or more outgoing edges.
Without loss of generality let node 1 be a node with exactly three outgoing edges
$e_1, e_2, e_3$, and let $L(e_1) = L(e_2) = A_1$. Let $\mathcal{D}(e)$ denote the destination node
of an edge $e \in E$. If $\mathcal{D}(e_1) = \mathcal{D}(e_2)$, then $e_1$ (or $e_2$) can be removed without
changing the output set of words. If $\mathcal{D}(e_1) \neq \mathcal{D}(e_2)$, assume, without loss of
generality, that $\mathcal{D}(e_1) = 1$ and $\mathcal{D}(e_2) = 2$. Now, if $L(e_3) = A_1$, then regardless
of its destination node, $e_3$ can be removed. If $L(e_3) = A_2$ and $\mathcal{D}(e_3) = 1$, then
$V_1$ is a common Lyapunov function for $\mathcal{A}$. The only remaining possibility is that
$L(e_3) = A_2$ and $\mathcal{D}(e_3) = 2$. Note that there must be an edge $e_4 \in E$ from node
2 to node 1, otherwise either node 2 would have two self-edges with the same
label or $V_2$ would be a common Lyapunov function for $\mathcal{A}$. If $L(e_4) = A_2$ then it
can be verified that $G(\{1, 2\}, \{e_1, e_2, e_3, e_4\})$ is path-complete and thus all other
edges can be removed. If there is no edge from node 2 to node 1 with label $A_2$
then $L(e_4) = A_1$ and node 2 must have a self-edge $e_5 \in E$ with label $L(e_5) = A_2$,
otherwise the graph would not be path-complete. In this case, it can be verified
that $e_2$ can be removed without affecting the output set of words. □

It can be verified that a path-complete graph with two nodes and fewer than
four edges must necessarily place two self-loops with different labels on one node,
which necessitates existence of a common Lyapunov function for the underlying
switched system. Since we are interested in exploiting the favorable properties of
graph Lyapunov functions in approximation of the JSR, we will focus on graphs
with four edges.

Before we proceed, for convenience we introduce the following notation: Given
a labeled graph $G(N, E)$ associated with two matrices $A_1$ and $A_2$, we denote by
$\bar{G}(N, E)$ the graph obtained by swapping $A_1$ and $A_2$ in all the labels on every
edge.

■ 5.4.2 Comparison of performance

It can be verified that for path-complete graphs with two nodes, four edges, and
two matrices, and without multiple self-loops on a single node, there are a total
of nine distinct graph topologies to consider. Of the nine graphs, six have the
property that every node has two incoming edges with different labels. These
are graphs $G_1$, $G_2$, $\bar{G}_2$, $G_3$, $\bar{G}_3$, and $G_4$ (Figure 5.3). Note that $\bar{G}_1 = G_1$ and
$\bar{G}_4 = G_4$. The duals of these six graphs, i.e., $G_1'$, $G_2'$, $\bar{G}_2'$, $G_3' = G_3$, $\bar{G}_3' = \bar{G}_3$, and
$G_4' = G_4$ have the property that every node has two outgoing edges with different
labels. Evidently, $G_3$, $\bar{G}_3$, and $G_4$ are self-dual graphs, i.e., they are isomorphic
to their dual graphs. The self-dual graphs are least interesting to us since, as we
will show, they necessitate existence of a common Lyapunov function for $\mathcal{A}$ (cf.
Proposition 5.12, equation (5.18)).

Note that all of these graphs perform at least as well as a common Lyapunov
function because we can always take $V_1(x) = V_2(x)$. Furthermore, we
know from Corollaries 5.9 and 5.8 that if Lyapunov inequalities associated with
$G_1$, $G_2$, $\bar{G}_2$, $G_3$, $\bar{G}_3$, and $G_4$ are satisfied, then $\max\{V_1(x), V_2(x)\}$ is a common
Lyapunov function, whereas, in the case of graphs $G_1'$, $G_2'$, $\bar{G}_2'$, $G_3'$, $\bar{G}_3'$, and $G_4'$,
the function $\min\{V_1(x), V_2(x)\}$ would serve as a common Lyapunov function.
Clearly, for the self-dual graphs $G_3$, $\bar{G}_3$, and $G_4$ both $\max\{V_1(x), V_2(x)\}$ and
$\min\{V_1(x), V_2(x)\}$ are common Lyapunov functions.

Notation: Given a set of matrices $\mathcal{A} = \{A_1, \ldots, A_m\}$, a path-complete
graph $G(N, E)$, and a class of functions $\mathcal{V}$, we denote by $\hat{\rho}_{\mathcal{V},G}(\mathcal{A})$ the upper
bound on the JSR of $\mathcal{A}$ that can be obtained by numerical optimization of GLFs
$V_i \in \mathcal{V}$, $i \in N$, defined over $G$. With a slight abuse of notation, we denote by
$\hat{\rho}_{\mathcal{V}}(\mathcal{A})$ the upper bound that is obtained by using a common Lyapunov function
$V \in \mathcal{V}$.

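Proposition 5.12. Consider the set $\mathcal{A} = \{A_1, A_2\}$, and let $G_1$, $G_2$, $G_3$, $G_4$, and
$H_3$ be the path-complete graphs shown in Figure 5.3. Then, the upper bounds on
the JSR of $\mathcal{A}$ obtained by analysis via the associated GLFs satisfy the following
relations:

$$\hat{\rho}_{\mathcal{V},G_1}(\mathcal{A}) = \hat{\rho}_{\mathcal{V},G_1'}(\mathcal{A}) \tag{5.17}$$

and

$$\hat{\rho}_{\mathcal{V}}(\mathcal{A}) = \hat{\rho}_{\mathcal{V},G_3}(\mathcal{A}) = \hat{\rho}_{\mathcal{V},\bar{G}_3}(\mathcal{A}) = \hat{\rho}_{\mathcal{V},G_4}(\mathcal{A}) \tag{5.18}$$

and

$$\hat{\rho}_{\mathcal{V},G_2}(\mathcal{A}) = \hat{\rho}_{\mathcal{V},H_3}(\mathcal{A}), \quad \hat{\rho}_{\mathcal{V},\bar{G}_2}(\mathcal{A}) = \hat{\rho}_{\mathcal{V},\bar{H}_3}(\mathcal{A}) \tag{5.19}$$

and

$$\hat{\rho}_{\mathcal{V},G_2'}(\mathcal{A}) = \hat{\rho}_{\mathcal{V},H_3'}(\mathcal{A}), \quad \hat{\rho}_{\mathcal{V},\bar{G}_2'}(\mathcal{A}) = \hat{\rho}_{\mathcal{V},\bar{H}_3'}(\mathcal{A}). \tag{5.20}$$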

Proof. A proof of (5.17) in more generality is provided in Section 5.5 (cf. Corollary
5.15). The proof of (5.18) is based on symmetry arguments. Let $\{V_1, V_2\}$ be
a GLF associated with $G_3$ ($V_1$ is associated with node 1 and $V_2$ is associated
with node 2). Then, by symmetry, $\{V_2, V_1\}$ is also a GLF for $G_3$ (where $V_1$ is
associated with node 2 and $V_2$ is associated with node 1). Therefore, letting
$V = V_1 + V_2$, we have that $\{V, V\}$ is a GLF for $G_3$ and thus, $V = V_1 + V_2$ is
also a common Lyapunov function for $\mathcal{A}$, which implies that $\hat{\rho}_{\mathcal{V},G_3}(\mathcal{A}) \geq \hat{\rho}_{\mathcal{V}}(\mathcal{A})$.
The other direction is trivial: If $V \in \mathcal{V}$ is a common Lyapunov function for $\mathcal{A}$,
then $\{V_1, V_2 \mid V_1 = V_2 = V\}$ is a GLF associated with $G_3$, and hence, $\hat{\rho}_{\mathcal{V},G_3}(\mathcal{A}) \leq
\hat{\rho}_{\mathcal{V}}(\mathcal{A})$. Identical arguments based on symmetry hold for $\bar{G}_3$ and $G_4$. We now
prove the left equality in (5.19); the proofs for the remaining equalities in (5.19)
and (5.20) are analogous. The equivalence between $G_2$ and $H_3$ is a special case of
the relation between a graph and its reduced model, obtained by removing a node
without any self-loops, adding a new edge per each pair of incoming and outgoing
edges to that node, and then labeling the new edges by taking the composition of
the labels of the corresponding incoming and outgoing edges in the original graph;
see [145], [144, Chap. 5]. Note that $H_3$ is an offspring of $G_2$ in this sense. This
intuition helps construct a proof. Let $\{V_1, V_2\}$ be a GLF associated with $G_2$. It
can be verified that $V_1$ is a Lyapunov function associated with $H_3$, and therefore,
$\hat{\rho}_{\mathcal{V},H_3}(\mathcal{A}) \leq \hat{\rho}_{\mathcal{V},G_2}(\mathcal{A})$. Similarly, if $V \in \mathcal{V}$ is a Lyapunov function associated with
$H_3$, then one can check that $\{V_1, V_2 \mid V_1(x) = V(x), V_2(x) = V(A_2 x)\}$ is a GLF
associated with $G_2$, and hence, $\hat{\rho}_{\mathcal{V},H_3}(\mathcal{A}) \geq \hat{\rho}_{\mathcal{V},G_2}(\mathcal{A})$. □

Figure 5.6. A diagram describing the relative performance of the path-complete graphs of
Figure 5.3 together with their duals and label permutations. The graphs placed in the same
circle always give the same approximation of the JSR. A graph at the end of an arrow results
in an approximation of the JSR that is always at least as good as that of the graph at the start
of the arrow. When there is no directed path between two graphs in this diagram, either graph
can outperform the other depending on the set of matrices $\mathcal{A}$.
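The last step of the proof of Proposition 5.12 can be checked edge by edge. The sketch below assumes that $G_2$ has the four unit-label edges listed in its first comment; this edge list is inferred from the reduction to $H_3$ described in the proof (remove node 2 and compose the labels) and is not read off Figure 5.3.

```latex
% Assumed edges of G_2: 1 -> 1 labeled A_1, 1 -> 2 labeled A_1,
%                       1 -> 2 labeled A_2, 2 -> 1 labeled A_2.
% Given V with V(A_1 x) <= V(x), V(A_2 A_1 x) <= V(x), V(A_2^2 x) <= V(x) (graph H_3),
% set V_1(x) = V(x) and V_2(x) = V(A_2 x). Then, edge by edge:
\begin{align*}
V_1(A_1 x) &= V(A_1 x) \le V(x) = V_1(x) && (1 \to 1,\ A_1),\\
V_2(A_1 x) &= V(A_2 A_1 x) \le V(x) = V_1(x) && (1 \to 2,\ A_1),\\
V_2(A_2 x) &= V(A_2^2 x) \le V(x) = V_1(x) && (1 \to 2,\ A_2),\\
V_1(A_2 x) &= V(A_2 x) = V_2(x) && (2 \to 1,\ A_2).
\end{align*}
```

As in Remark 5.2.1, $V_2$ is positive definite only when $A_2$ is invertible, which can be arranged by a small perturbation.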

Remark 5.4.1. Proposition 5.12 (equation (5.17)) establishes the equivalence of the
bounds obtained from the pair of dual graphs $G_1$ and $G_1'$. This, however, is not
true for graphs $G_2$ and $\bar{G}_2$ as there exist examples for which

$$\hat{\rho}_{\mathcal{V},G_2}(\mathcal{A}) \neq \hat{\rho}_{\mathcal{V},G_2'}(\mathcal{A}),$$
$$\hat{\rho}_{\mathcal{V},\bar{G}_2}(\mathcal{A}) \neq \hat{\rho}_{\mathcal{V},\bar{G}_2'}(\mathcal{A}).$$

The diagram in Figure 5.6 summarizes the results of this section. We remark
that no relations other than the ones given in Figure 5.6 can be made among
these path-complete graphs. Indeed, whenever there are no relations between two
graphs in Figure 5.6, we have examples of matrices $A_1, A_2$ (not presented here)
for which one graph can outperform the other.

The graphs $G_1$ and $G_1'$ seem to statistically perform better than all other
graphs in Figure 5.6. For example, we ran experiments on a set of 100 random pairs of 5 × 5
matrices $\{A_1, A_2\}$ with elements uniformly distributed in $[-1, 1]$ to compare the
performance of graphs $G_1$, $G_2$ and $\bar{G}_2$. If in each case we also consider the relabeled
matrices (i.e., $\{A_2, A_1\}$) as our input, then, out of the total 200 instances, graph
$G_1$ produced strictly better bounds on the JSR 58 times, whereas graphs $G_2$ and
$\bar{G}_2$ each produced the best bound of the three graphs only 23 times. (The numbers
do not add up to 200 due to ties.) In addition to this superior performance, the
bound $\hat{\rho}_{\mathcal{V},G_1}(\{A_1, A_2\})$ obtained by analysis via the graph $G_1$ is invariant under
(i) permutation of the labels $A_1$ and $A_2$ (obvious), and (ii) transposing of $A_1$ and
$A_2$ (Corollary 5.15). These are desirable properties which fail to hold for $G_2$ and
$\bar{G}_2$ or their duals. Motivated by these observations, we generalize $G_1$ and its dual
$G_1'$ in the next section to the case of $m$ matrices and $m$ Lyapunov functions and
establish that they have certain appealing properties. We will prove (cf. Theorem
5.16) that these graphs always perform better than a common Lyapunov function
in 2 steps (i.e., the graph $H_2$ in Figure 5.3), whereas this is not the case for $G_2$
and $\bar{G}_2$ or their duals.

■ 5.5 Further analysis of a particular family of path-complete graphs

The framework of path-complete graphs provides a multitude of semidefinite
programming based techniques for the approximation of the JSR whose performance
varies with computational cost. For instance, as we increase the number of nodes
of the graph, or the degree of the polynomial Lyapunov functions assigned to the
nodes, or the number of edges of the graph that instead of labels of length one
have labels of higher length, we obtain better results but at a higher
computational cost. Many of these approximation techniques are asymptotically tight, so
in theory they can be used to achieve any desired accuracy of approximation. For
example,

$$\hat{\rho}_{\mathcal{V}^{SOS,2d}}(\mathcal{A}) \to \rho(\mathcal{A}) \ \text{as} \ 2d \to \infty,$$

where $\mathcal{V}^{SOS,2d}$ denotes the class of sum of squares homogeneous polynomial
Lyapunov functions of degree $2d$. (Recall our notation for bounds from Section 5.4.2.)
It is also true that a common quadratic Lyapunov function for products of higher
length achieves the true JSR asymptotically [85]; i.e.,$^5$

$$\sqrt[t]{\hat{\rho}_{\mathcal{V}^2}(\mathcal{A}^t)} \to \rho(\mathcal{A}) \ \text{as} \ t \to \infty.$$

Nevertheless, it is desirable for practical purposes to identify a class of path- 
complete graphs that provide a good tradeoff between quality of approxima- 
tion and computational cost. Towards this objective, we propose the use of m 
quadratic Lyapunov functions assigned to the nodes of the De Bruijn graph of 
order 1 on m symbols for the approximation of the JSR of a set of m matrices. 
This graph and its dual are particular path-complete graphs with m nodes and m 2 
edges and will be the subject of study in this section. If we denote the quadratic 
Lyapunov functions by x T PiX, then we are proposing the use of linear matrix 
inequalities 

Pi >- Vi = l,...,m, , v 

^AJP 3 A % < P t = {!,..., m} 2 ^ 



5 By V 2 we denote the class of quadratic homogeneous polynomials. We drop the superscript 
"SOS" because nonnegative quadratic polynomials are always sums of squares. 
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or the set of LMIs 

Pi >■ Vi = l,...,m, , v 

7 2 AfP,A d ^ Vi,j = {l,...,m} 2 ^ 

for the approximation of the JSR of $m$ matrices. Throughout this section, we
denote the path-complete graphs associated with (5.21) and (5.22) with $G_1$ and $G_1'$
respectively. (The De Bruijn graph of order 1, by standard convention, is actually
the graph $G_1'$.) Observe that $G_1$ and $G_1'$ are indeed dual graphs as they can be
obtained from each other by reversing the direction of the edges. For the case
$m = 2$, our notation is consistent with the previous section and these graphs are
illustrated in Figure 5.4. Also observe from Corollary 5.8 and Corollary 5.9 that
the LMIs in (5.21) give rise to max-of-quadratics Lyapunov functions, whereas
the LMIs in (5.22) lead to min-of-quadratics Lyapunov functions. We will prove
in this section that the approximation bound obtained by these LMIs (i.e., the
reciprocal of the largest $\gamma$ for which the LMIs (5.21) or (5.22) hold) is always the
same and lies within a multiplicative factor of $1/\sqrt[4]{n}$ of the true JSR, where $n$ is the
dimension of the matrices. The relation between the bound obtained by a pair of
dual path-complete graphs has a connection to transposition of the matrices in
the set $\mathcal{A}$. We explain this next.
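
To see why the two families of LMIs certify max-type and min-type Lyapunov functions, the following short derivation may help. It is a sketch under the assumption that (5.21) reads $\gamma^2 A_i^T P_j A_i \preceq P_i$ and (5.22) reads $\gamma^2 A_i^T P_i A_i \preceq P_j$ for all $i, j$, with every $P_i \succ 0$. The names $V$ and $W$ are introduced here only for illustration.

```latex
\begin{align*}
% Max-of-quadratics from (5.21): bound each term of the max by x^T P_i x.
V(x) &:= \max_{j} \, x^T P_j x, &
V(\gamma A_i x) &= \max_{j} \, \gamma^2 x^T A_i^T P_j A_i x
                 \;\le\; x^T P_i x \;\le\; V(x), \\
% Min-of-quadratics from (5.22): pick the i-th term of the min, then use the LMI for every j.
W(x) &:= \min_{j} \, x^T P_j x, &
W(\gamma A_i x) &\le \gamma^2 x^T A_i^T P_i A_i x
                 \;\le\; \min_{j} \, x^T P_j x \;=\; W(x).
\end{align*}
```

In both cases the function does not increase along the scaled matrices $\gamma A_i$, for every $i$, which is consistent with the max and min statements attributed above to Corollaries 5.8 and 5.9.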

5.5.1 Duality and invariance under transposition

In [63], [64], it is shown that absolute asymptotic stability of the linear difference
inclusion in (5.3) defined by the matrices $\mathcal{A} = \{A_1, \ldots, A_m\}$ is equivalent to
absolute asymptotic stability of (5.3) for the transposed matrices $\mathcal{A}^T := \{A_1^T, \ldots, A_m^T\}$.
Note that this fact is immediately seen from the definition of the JSR in (5.1),
since $\rho(\mathcal{A}) = \rho(\mathcal{A}^T)$. It is also well-known that

$$\hat{\rho}_{\mathcal{V}^2}(\mathcal{A}) = \hat{\rho}_{\mathcal{V}^2}(\mathcal{A}^T).$$

Indeed, if $x^T P x$ is a common quadratic Lyapunov function for the set $\mathcal{A}$, then it
is easy to show that $x^T P^{-1} x$ is a common quadratic Lyapunov function for the
set $\mathcal{A}^T$. However, this nice property is not true for the bound obtained from some
other techniques. For instance, the next example shows that

$$\hat{\rho}_{\mathcal{V}^{SOS,4}}(\mathcal{A}) \neq \hat{\rho}_{\mathcal{V}^{SOS,4}}(\mathcal{A}^T), \tag{5.23}$$

i.e., the upper bound obtained by searching for a common quartic sos polynomial
is not invariant under transposition.

Example 5.5.1. Consider the set of matrices $\mathcal{A} = \{A_1, A_2, A_3, A_4\}$, with

[Matrices $A_1, \ldots, A_4$: the entries are only partly legible in this copy. The legible
values, in reading order, are 10, -6, -1, -5, 9, -14, -14, 1, -8, -2, 8, 1, -16, 5, -8, -12,
17, 7, 16, 11.]

We have $\hat{\rho}_{\mathcal{V}^{SOS,4}}(\mathcal{A}) = 21.411$, but $\hat{\rho}_{\mathcal{V}^{SOS,4}}(\mathcal{A}^T) = 21.214$ (up to three significant
digits). △

Similarly, the bound obtained by non-convex inequalities proposed in [63]
is not invariant under transposing the matrices. For such methods, one would
have to run the numerical optimization twice (once for the set $\mathcal{A}$ and once for
the set $\mathcal{A}^T$) and then pick the better bound of the two. We will show that by
contrast, the bounds obtained from the LMIs in (5.21) and (5.22) are invariant
under transposing the matrices. Before we do that, let us prove a general result
which states that for path-complete graphs with quadratic Lyapunov functions
as nodes, transposing the matrices has the same effect as dualizing the graph.

Theorem 5.13. Let $G(N,E)$ be a path-complete graph, and let $G'(N,E')$ be its
dual graph. Then,

$$\hat{\rho}_{\mathcal{V}^2,G}(\mathcal{A}^T) = \hat{\rho}_{\mathcal{V}^2,G'}(\mathcal{A}). \tag{5.24}$$

Proof. For ease of notation, we prove the claim for the case where the edge labels
of $G(N,E)$ have length one. The proof of the general case is identical. Pick an
arbitrary edge $(i,j) \in E$ going from node $i$ to node $j$ and labeled with some
matrix $A_l \in \mathcal{A}$. By the application of the Schur complement we have

$$A_l P_j A_l^T \preceq P_i \iff \begin{bmatrix} P_i & A_l \\ A_l^T & P_j^{-1} \end{bmatrix} \succeq 0 \iff A_l^T P_i^{-1} A_l \preceq P_j^{-1}.$$

But this already establishes the claim since we see that $P_i$ and $P_j$ satisfy the LMI
associated with edge $(i,j) \in E$ when the matrix $A_l$ is transposed if and only if
$P_i^{-1}$ and $P_j^{-1}$ satisfy the LMI associated with edge $(j,i) \in E'$. □

Corollary 5.14. $\hat{\rho}_{\mathcal{V}^2,G}(\mathcal{A}) = \hat{\rho}_{\mathcal{V}^2,G}(\mathcal{A}^T)$ if and only if $\hat{\rho}_{\mathcal{V}^2,G}(\mathcal{A}) = \hat{\rho}_{\mathcal{V}^2,G'}(\mathcal{A})$.

Proof. This is an immediate consequence of the equality in (5.24). □

It is an interesting question for future research to characterize the topologies
of path-complete graphs for which one has $\hat{\rho}_{\mathcal{V}^2,G}(\mathcal{A}) = \hat{\rho}_{\mathcal{V}^2,G}(\mathcal{A}^T)$. For example,
the above corollary shows that this is obviously the case for any path-complete
graph that is self-dual. Let us show next that this is also the case for graphs $G_1$
and $G_1'$ despite the fact that they are not self-dual.

Corollary 5.15. For the path-complete graphs $G_1$ and $G_1'$ associated with the
inequalities in (5.21) and (5.22), and for any class of continuous, homogeneous,
and positive definite functions $\mathcal{V}$, we have

$$\hat{\rho}_{\mathcal{V},G_1}(\mathcal{A}) = \hat{\rho}_{\mathcal{V},G_1'}(\mathcal{A}). \tag{5.25}$$

Moreover, if quadratic Lyapunov functions are assigned to the nodes of $G_1$ and
$G_1'$, then we have

$$\hat{\rho}_{\mathcal{V}^2,G_1}(\mathcal{A}) = \hat{\rho}_{\mathcal{V}^2,G_1}(\mathcal{A}^T) = \hat{\rho}_{\mathcal{V}^2,G_1'}(\mathcal{A}) = \hat{\rho}_{\mathcal{V}^2,G_1'}(\mathcal{A}^T). \tag{5.26}$$

Proof. The proof of (5.25) is established by observing that the GLFs associated
with $G_1$ and $G_1'$ can be derived from one another via $V_i'(A_i x) = V_i(x)$. (Note that
we are relying here on the assumption that the matrices $A_i$ are invertible, which as
we noted in Remark 5.2.1, is not a limiting assumption.) Since (5.25) in particular
implies that $\hat{\rho}_{\mathcal{V}^2,G_1}(\mathcal{A}) = \hat{\rho}_{\mathcal{V}^2,G_1'}(\mathcal{A})$, we get the rest of the equalities in (5.26)
immediately from Corollary 5.14 and this finishes the proof. For concreteness, let
us also prove the leftmost equality in (5.26) directly. Let $P_i$, $i = 1, \ldots, m$, satisfy
the LMIs in (5.21) for the set of matrices $\mathcal{A}$. Then, the reader can check that

$$\tilde{P}_i = A_i P_i^{-1} A_i^T, \quad i = 1, \ldots, m,$$

satisfy the LMIs in (5.21) for the set of matrices $\mathcal{A}^T$. □
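
The verification left to the reader can be sketched as follows. The display takes $\gamma = 1$ (an assumption made only to shorten the formulas; the general case follows by scaling the matrices), writes $\tilde{P}_i = A_i P_i^{-1} A_i^T$, and starts from (5.21) for $\mathcal{A}$ with the roles of $i$ and $j$ renamed, i.e. $A_j^T P_i A_j \preceq P_j$ for all $i, j$.

```latex
\begin{align*}
A_j^T P_i A_j \preceq P_j
 &\iff \begin{bmatrix} P_j & A_j^T \\ A_j & P_i^{-1} \end{bmatrix} \succeq 0
   && \text{(Schur complement of the block } P_i^{-1}\text{)} \\
 &\iff A_j P_j^{-1} A_j^T \preceq P_i^{-1}
   && \text{(Schur complement of the block } P_j\text{)} \\
 &\implies A_i \bigl( A_j P_j^{-1} A_j^T \bigr) A_i^T \preceq A_i P_i^{-1} A_i^T
   && \text{(congruence with } A_i\text{)} \\
 &\iff (A_i^T)^T \tilde{P}_j \, (A_i^T) \preceq \tilde{P}_i .
\end{align*}
```

The last line is (5.21) written for the transposed matrices $A_i^T$, and $\tilde{P}_i \succ 0$ because $A_i$ is invertible and $P_i \succ 0$.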



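5.5.2 An approximation guarantee

The next theorem gives a bound on the quality of approximation of the estimate
resulting from the LMIs in (5.21) and (5.22). Since we have already shown that
$\hat{\rho}_{\mathcal{V}^2,G_1}(\mathcal{A}) = \hat{\rho}_{\mathcal{V}^2,G_1'}(\mathcal{A})$, it is enough to prove this bound for the LMIs in (5.21).

Theorem 5.16. Let $\mathcal{A}$ be a set of $m$ matrices in $\mathbb{R}^{n \times n}$ with JSR $\rho(\mathcal{A})$. Let
$\hat{\rho}_{\mathcal{V}^2,G_1}(\mathcal{A})$ be the bound on the JSR obtained from the LMIs in (5.21). Then,

$$\frac{1}{\sqrt[4]{n}} \, \hat{\rho}_{\mathcal{V}^2,G_1}(\mathcal{A}) \le \rho(\mathcal{A}) \le \hat{\rho}_{\mathcal{V}^2,G_1}(\mathcal{A}). \tag{5.27}$$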



Proof. The right inequality is just a consequence of $G_1$ being a path-complete
graph (Theorem 5.4). To prove the left inequality, consider the set $\mathcal{A}^2$ consisting
of all $m^2$ products of length two. In view of (5.6), a common quadratic Lyapunov
function for this set satisfies the bound

$$\frac{1}{\sqrt{n}} \, \hat{\rho}_{\mathcal{V}^2}(\mathcal{A}^2) \le \rho(\mathcal{A}^2).$$

It is easy to show that

$$\rho(\mathcal{A}^2) = \rho^2(\mathcal{A}).$$

See e.g. [85]. Therefore,

$$\frac{1}{\sqrt[4]{n}} \, \hat{\rho}^{1/2}_{\mathcal{V}^2}(\mathcal{A}^2) \le \rho(\mathcal{A}). \tag{5.28}$$

Now suppose for some $\gamma > 0$, $x^T Q x$ is a common quadratic Lyapunov function
for the matrices in $\mathcal{A}^2_\gamma$; i.e., it satisfies

$$
\begin{array}{rll}
Q &\succ 0 & \\
\gamma^4 (A_i A_j)^T Q A_i A_j &\preceq Q & \forall (i,j) \in \{1, \ldots, m\}^2.
\end{array}
$$

Then, we leave it to the reader to check that

$$P_i = Q + \gamma^2 A_i^T Q A_i, \quad i = 1, \ldots, m,$$

satisfy (5.21). Hence,

$$\hat{\rho}_{\mathcal{V}^2,G_1}(\mathcal{A}) \le \hat{\rho}^{1/2}_{\mathcal{V}^2}(\mathcal{A}^2),$$

and in view of (5.28) the claim is established. □
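
The check left to the reader is a short computation. The sketch below assumes the choice $P_i = Q + \gamma^2 A_i^T Q A_i$ (the factor $\gamma^2$ is what makes the terms match) together with the hypothesis $\gamma^4 (A_j A_i)^T Q (A_j A_i) \preceq Q$ for all $i, j$.

```latex
\begin{align*}
\gamma^2 A_i^T P_j A_i
 &= \gamma^2 A_i^T \bigl( Q + \gamma^2 A_j^T Q A_j \bigr) A_i \\
 &= \gamma^2 A_i^T Q A_i + \gamma^4 (A_j A_i)^T Q (A_j A_i) \\
 &\preceq \gamma^2 A_i^T Q A_i + Q \;=\; P_i .
\end{align*}
```

Positivity is immediate as well, since $P_i \succeq Q \succ 0$. Both requirements in (5.21) therefore hold with the same $\gamma$.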

Note that the bound in (5.27) is independent of the number of matrices. Moreover,
we remark that this bound is tighter, in terms of its dependence on $n$, than
the known bounds for $\hat{\rho}_{\mathcal{V}^{SOS,2d}}$ for any finite degree $2d$ of the sum of squares
polynomials. The reader can check that the bound in (5.7) goes asymptotically as $1/\sqrt{n}$.
Numerical evidence suggests that the performance of both the bound obtained by
sum of squares polynomials and the bound obtained by the LMIs in (5.21) and
(5.22) is much better than the provable bounds in (5.7) and in Theorem 5.16.
The problem of improving these bounds or establishing their tightness is open. It
goes without saying that instead of quadratic functions, we can associate sum of
squares polynomials to the nodes of $G_1$ and obtain a more powerful technique for
which we can also prove better bounds with the exact same arguments.

5.5.3 Numerical examples

In the proof of Theorem 5.16, we essentially showed that the bound obtained 
from LMIs in (5.21) is tighter than the bound obtained from a common quadratic 
applied to products of length two. Our first example shows that the LMIs in 
(5.21) can in fact do better than a common quadratic applied to products of any 
finite length. 

Example 5.5.2. Consider the set of matrices $\mathcal{A} = \{A_1, A_2\}$, with

$$A_1 = \begin{bmatrix} 1 & 0 \\ 1 & 0 \end{bmatrix}, \quad A_2 = \begin{bmatrix} 0 & 1 \\ 0 & -1 \end{bmatrix}.$$

This is a benchmark set of matrices that has been studied in [13], [122], [6] because
it gives the worst case approximation ratio of a common quadratic Lyapunov
function. Indeed, it is easy to show that $\rho(\mathcal{A}) = 1$, but $\hat{\rho}_{\mathcal{V}^2}(\mathcal{A}) = \sqrt{2}$. Moreover,
the bound obtained by a common quadratic function applied to the set $\mathcal{A}^t$ is

$$\hat{\rho}^{1/t}_{\mathcal{V}^2}(\mathcal{A}^t) = 2^{1/2t},$$

which for no finite value of $t$ is exact. On the other hand, we show that the LMIs
in (5.21) give the exact bound; i.e., $\hat{\rho}_{\mathcal{V}^2,G_1}(\mathcal{A}) = 1$. Due to the simple structure of
$A_1$ and $A_2$, we can even give an analytical expression for our Lyapunov functions.
Given any $\varepsilon > 0$, the LMIs in (5.21) with $\gamma = 1/(1 + \varepsilon)$ are feasible with

$$P_1 = \begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix}, \quad P_2 = \begin{bmatrix} b & 0 \\ 0 & a \end{bmatrix},$$

for any $b > 0$ and $a > b/2\varepsilon$. △
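
The feasibility claim in Example 5.5.2 can be checked by direct computation. The following is a minimal sketch, not the author's code: it assumes $A_1 = [1\ 0;\ 1\ 0]$, $A_2 = [0\ 1;\ 0\ {-1}]$, $P_1 = \mathrm{diag}(a, b)$, $P_2 = \mathrm{diag}(b, a)$, and the form $\gamma^2 A_i^T P_j A_i \preceq P_i$ for (5.21). The helper names are hypothetical, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction

# Benchmark pair assumed for Example 5.5.2 (rows listed top to bottom).
A = [
    [[1, 0], [1, 0]],    # A_1
    [[0, 1], [0, -1]],   # A_2
]

def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def is_psd_2x2(M):
    # A symmetric 2x2 matrix is positive semidefinite iff its diagonal
    # entries and its determinant are all nonnegative.
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return M[0][0] >= 0 and M[1][1] >= 0 and det >= 0

def lmis_5_21_hold(eps, a, b):
    """Check gamma^2 A_i^T P_j A_i <= P_i for all i, j, with gamma = 1/(1+eps)."""
    gamma_sq = 1 / (1 + eps) ** 2
    P = [
        [[a, 0], [0, b]],  # P_1
        [[b, 0], [0, a]],  # P_2
    ]
    if a <= 0 or b <= 0:   # P_1, P_2 are diagonal, so P_i > 0 iff a, b > 0
        return False
    for i in range(2):
        for j in range(2):
            M = matmul(transpose(A[i]), matmul(P[j], A[i]))
            gap = [[P[i][r][c] - gamma_sq * M[r][c] for c in range(2)]
                   for r in range(2)]
            if not is_psd_2x2(gap):
                return False
    return True
```

For instance, with $\varepsilon = 1/10$ and $b = 1$ the condition $a > b/2\varepsilon$ reads $a > 5$; calling `lmis_5_21_hold(Fraction(1, 10), Fraction(6), Fraction(1))` confirms feasibility, while $a = 2$ violates the inequalities.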

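Example 5.5.3. Consider the set of randomly generated matrices $\mathcal{A} = \{A_1, A_2, A_3\}$,
with

[Matrices $A_1, A_2, A_3$: some entries (apparently zeros) are missing from the rows below in
this copy, so rows with fewer than five values are incomplete.]

A_1 (rows):
    -2   2   2   4
     0  -4  -1  -6
     2   6  -8
    -2  -2  -3   1  -3
    -1  -5   2   6  -4

A_2 (rows):
    -5  -2  -4   6  -1
    114  3  -5
    -2   3  -2   8  -1
     8  -6   2   5
    -1  -5   1   7  -4

A_3 (rows):
     3  -8  -3   2  -4
    -2  -2  -9   4  -1
     2   2  -5  -8   6
    -4  -1   4  -3
     5  -3   5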



A lower bound on $\rho(\mathcal{A})$ is $\rho(A_1 A_2 A_2)^{1/3} = 11.8015$. The upper approximations
for $\rho(\mathcal{A})$ that we computed for this example are as follows:

$$
\begin{array}{lcl}
\hat{\rho}_{\mathcal{V}^2}(\mathcal{A}) &=& 12.5683 \\
\hat{\rho}^{1/2}_{\mathcal{V}^2}(\mathcal{A}^2) &=& 11.9575 \\
\hat{\rho}_{\mathcal{V}^2,G_1}(\mathcal{A}) &=& 11.8097 \\
\hat{\rho}_{\mathcal{V}^{SOS,4}}(\mathcal{A}) &=& 11.8015.
\end{array}
\tag{5.29}
$$

The bound $\hat{\rho}_{\mathcal{V}^{SOS,4}}$ matches the lower bound numerically and is most likely exact
for this example. This bound is slightly better than $\hat{\rho}_{\mathcal{V}^2,G_1}$. However, a simple
calculation shows that the semidefinite program resulting in $\hat{\rho}_{\mathcal{V}^{SOS,4}}$ has 25 more
decision variables than the one for $\hat{\rho}_{\mathcal{V}^2,G_1}$. Also, the running time of the algorithm
leading to $\hat{\rho}_{\mathcal{V}^{SOS,4}}$ is noticeably longer than the one leading to $\hat{\rho}_{\mathcal{V}^2,G_1}$. In general,
when the dimension of the matrices is large, it can often be cost-effective to
increase the number of the nodes of our path-complete graphs but keep the degree
of the polynomial Lyapunov functions assigned to its nodes relatively low. △



5.6 Converse Lyapunov theorems and approximation with arbitrary accuracy

It is well-known that existence of a Lyapunov function which is the pointwise maximum
of quadratics is not only sufficient but also necessary for absolute asymptotic
stability of (5.2) or (5.3); see e.g. [105]. This is perhaps an intuitive fact if we
recall that switched systems of type (5.2) and (5.3) always admit a convex Lya- 
punov function. Indeed, if we take "enough" quadratics, the convex and compact 
unit sublevel set of a convex Lyapunov function can be approximated arbitrar- 
ily well with sublevel sets of max-of-quadratics Lyapunov functions, which are 
intersections of ellipsoids. This of course implies that the bound obtained from 
max-of-quadratics Lyapunov functions is asymptotically tight for the approxima- 
tion of the JSR. However, this converse Lyapunov theorem does not answer two 
natural questions of importance in practice: (i) How many quadratic functions 
do we need to achieve a desired quality of approximation? (ii) Can we search 
for these quadratic functions via semidefinite programming or do we need to re- 
sort to non-convex formulations? Our next theorem provides an answer to these 
questions. 

Theorem 5.17. Let $\mathcal{A}$ be a set of $m$ matrices in $\mathbb{R}^{n \times n}$. Given any positive
integer $l$, there exists an explicit path-complete graph $G$ consisting of $m^{l-1}$ nodes
assigned to quadratic Lyapunov functions and $m^l$ edges with labels of length one
such that the linear matrix inequalities associated with $G$ imply existence of a
max-of-quadratics Lyapunov function and the resulting bound obtained from the
LMIs satisfies

$$\frac{1}{\sqrt[2l]{n}} \, \hat{\rho}_{\mathcal{V}^2,G}(\mathcal{A}) \le \rho(\mathcal{A}) \le \hat{\rho}_{\mathcal{V}^2,G}(\mathcal{A}). \tag{5.30}$$



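Proof. Let us denote the $m^{l-1}$ quadratic Lyapunov functions by $x^T P_{i_1 \ldots i_{l-1}} x$,
where $i_1 \ldots i_{l-1} \in \{1, \ldots, m\}^{l-1}$ is a multi-index used for ease of reference to
our Lyapunov functions. We claim that we can let $G$ be the graph dual to the
De Bruijn graph of order $l - 1$ on $m$ symbols. The LMIs associated to this graph
are given by

$$
\begin{array}{rll}
P_{i_1 i_2 \ldots i_{l-2} i_{l-1}} &\succ 0 & \forall i_1 \ldots i_{l-1} \in \{1, \ldots, m\}^{l-1}, \\
A_j^T P_{i_1 i_2 \ldots i_{l-2} i_{l-1}} A_j &\preceq P_{i_2 i_3 \ldots i_{l-1} j} & \forall i_1 \ldots i_{l-1} \in \{1, \ldots, m\}^{l-1}, \ \forall j \in \{1, \ldots, m\}.
\end{array}
\tag{5.31}
$$

The fact that $G$ is path-complete and that the LMIs imply existence of a
max-of-quadratics Lyapunov function follows from Corollary 5.9. The proof that these
LMIs satisfy the bound in (5.30) is a straightforward generalization of the proof
of Theorem 5.16. By the same arguments we have

$$\frac{1}{\sqrt[2l]{n}} \, \hat{\rho}^{1/l}_{\mathcal{V}^2}(\mathcal{A}^l) \le \rho(\mathcal{A}). \tag{5.32}$$

Suppose $x^T Q x$ is a common quadratic Lyapunov function for the matrices in $\mathcal{A}^l$;
i.e., it satisfies

$$
\begin{array}{rll}
Q &\succ 0 & \\
(A_{i_1} \cdots A_{i_l})^T Q \, (A_{i_1} \cdots A_{i_l}) &\preceq Q & \forall i_1 \ldots i_l \in \{1, \ldots, m\}^l.
\end{array}
$$

Then, it is easy to check that$^6$

$$
\begin{aligned}
P_{i_1 i_2 \ldots i_{l-2} i_{l-1}} = {} & Q + A_{i_{l-1}}^T Q A_{i_{l-1}} \\
& + (A_{i_{l-2}} A_{i_{l-1}})^T Q \, (A_{i_{l-2}} A_{i_{l-1}}) + \cdots \\
& + (A_{i_1} A_{i_2} \cdots A_{i_{l-2}} A_{i_{l-1}})^T Q \, (A_{i_1} A_{i_2} \cdots A_{i_{l-2}} A_{i_{l-1}}),
\end{aligned}
\qquad i_1 \ldots i_{l-1} \in \{1, \ldots, m\}^{l-1},
$$

satisfy (5.31). Hence,

$$\hat{\rho}_{\mathcal{V}^2,G}(\mathcal{A}) \le \hat{\rho}^{1/l}_{\mathcal{V}^2}(\mathcal{A}^l),$$

and in view of (5.32) the claim is established. □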
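
As a small sanity check on the counts in Theorem 5.17, the index pattern of the LMIs (5.31) can be enumerated programmatically. The sketch below is illustrative only. The function names are hypothetical, and it assumes that the second family in (5.31) reads $A_j^T P_{i_1 \ldots i_{l-1}} A_j \preceq P_{i_2 \ldots i_{l-1} j}$, which for $l = 2$ reduces to (5.21) with the roles of $i$ and $j$ exchanged.

```python
from itertools import product

def debruijn_dual_lmis(m, l):
    """Enumerate the index pattern of the LMIs assumed for (5.31).

    Each triple (inner, j, outer) stands for the inequality
        A_j^T P_inner A_j <= P_outer,
    where inner = (i_1, ..., i_{l-1}) and outer = (i_2, ..., i_{l-1}, j).
    Symbols are 1, ..., m as in the text.
    """
    symbols = range(1, m + 1)
    nodes = list(product(symbols, repeat=l - 1))
    # Shift the multi-index left by one position and append j.
    lmis = [(inner, j, (inner + (j,))[1:]) for inner in nodes for j in symbols]
    return nodes, lmis

def every_node_sees_every_label(nodes, lmis, m):
    # For a max-of-quadratics certificate, each V_v(A_j x) must be bounded
    # by some node function, for every node v and every label j.
    covered = {(inner, j) for inner, j, _ in lmis}
    return all((v, j) in covered for v in nodes for j in range(1, m + 1))
```

With `m = 3` and `l = 3` the enumeration produces $m^{l-1} = 9$ multi-indices and $m^l = 27$ distinct inequalities, matching the node and edge counts stated in the theorem.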

Remark 5.6.1. A converse Lyapunov theorem identical to Theorem 5.17 can be
proven for the min-of-quadratics Lyapunov functions. The only difference is that
the LMIs in (5.31) would get replaced by the ones corresponding to the dual graph
of $G$.

$^6$ The construction of the Lyapunov function here is a special case of a general scheme for
constructing Lyapunov functions that are monotonically decreasing from those that decrease
only every few steps; see [1, p. 58].

Our last theorem establishes approximation bounds for a family of
path-complete graphs with one single node but several edges labeled with words of
different lengths. Examples of such path-complete graphs include graph $H_3$ in
Figure 5.3 and graph $H_4$ in Figure 5.5.

Theorem 5.18. Let $\mathcal{A}$ be a set of matrices in $\mathbb{R}^{n \times n}$. Let $\tilde{G}(\{1\}, E)$ be a
path-complete graph, and $l$ be the length of the shortest word in $\tilde{\mathcal{A}} = \{L(e) : e \in E\}$.
Then $\hat{\rho}_{\mathcal{V}^2,\tilde{G}}(\mathcal{A})$ provides an estimate of $\rho(\mathcal{A})$ that satisfies

$$\frac{1}{\sqrt[2l]{n}} \, \hat{\rho}_{\mathcal{V}^2,\tilde{G}}(\mathcal{A}) \le \rho(\mathcal{A}) \le \hat{\rho}_{\mathcal{V}^2,\tilde{G}}(\mathcal{A}).$$

Proof. The right inequality is obvious; we prove the left one. Since both $\hat{\rho}_{\mathcal{V}^2,\tilde{G}}(\mathcal{A})$
and $\rho$ are homogeneous in $\mathcal{A}$, we may assume, without loss of generality, that
$\hat{\rho}_{\mathcal{V}^2,\tilde{G}}(\mathcal{A}) = 1$. Suppose for the sake of contradiction that

$$\rho(\mathcal{A}) < 1/\sqrt[2l]{n}. \tag{5.33}$$

We will show that this implies that $\hat{\rho}_{\mathcal{V}^2,\tilde{G}}(\mathcal{A}) < 1$. Towards this goal, let us first
prove that $\rho(\tilde{\mathcal{A}}) \le \rho^l(\mathcal{A})$. Indeed, if we had $\rho(\tilde{\mathcal{A}}) > \rho^l(\mathcal{A})$, then there would
exist$^7$ an integer $i$ and a product $A_\sigma \in \tilde{\mathcal{A}}^i$ such that

$$\rho^{1/i}(A_\sigma) > \rho^l(\mathcal{A}). \tag{5.34}$$

Since we also have $A_\sigma \in \mathcal{A}^j$ (for some $j \ge il$), it follows that

$$\rho^{1/j}(A_\sigma) \le \rho(\mathcal{A}). \tag{5.35}$$

The inequality in (5.34) together with $\rho(\mathcal{A}) \le 1$ gives

$$\rho^{1/j}(A_\sigma) > \rho^{il/j}(\mathcal{A}) \ge \rho(\mathcal{A}).$$

But this contradicts (5.35). Hence we have shown

$$\rho(\tilde{\mathcal{A}}) \le \rho^l(\mathcal{A}).$$

Now, by our hypothesis (5.33) above, we have that $\rho(\tilde{\mathcal{A}}) < 1/\sqrt{n}$. Therefore, there
exists $\varepsilon > 0$ such that $\rho((1 + \varepsilon)\tilde{\mathcal{A}}) < 1/\sqrt{n}$. It then follows from (5.6) that there
exists a common quadratic Lyapunov function for $(1 + \varepsilon)\tilde{\mathcal{A}}$. Hence, $\hat{\rho}_{\mathcal{V}^2}((1 + \varepsilon)\tilde{\mathcal{A}}) \le 1$,
which immediately implies that $\hat{\rho}_{\mathcal{V}^2,\tilde{G}}(\mathcal{A}) < 1$, a contradiction. □

A noteworthy immediate corollary of Theorem 5.18 (obtained by setting
$\tilde{\mathcal{A}} = \bigcup_{t=r}^{r+k} \mathcal{A}^t$) is the following: If $\rho(\mathcal{A}) < 1/\sqrt[2r]{n}$, then there exists a quadratic Lyapunov
function that decreases simultaneously for all products of lengths $r, r+1, \ldots, r+k$,
for any desired value of $k$. Note that this fact is obvious for $r = 1$, but nonobvious
for $r \ge 2$.

$^7$ Here, we are appealing to the well-known fact about the JSR of a general set of matrices
$\mathcal{B}$: $\rho(\mathcal{B}) = \limsup_{k \to \infty} \max_{B \in \mathcal{B}^k} \rho^{1/k}(B)$. See e.g. [85, Chap. 1].

5.7 Conclusions and future directions

We introduced the framework of path-complete graph Lyapunov functions for the
formulation of semidefinite programming based algorithms for approximating the
joint spectral radius (or equivalently establishing absolute asymptotic stability of
an arbitrary switched linear system). We defined the notion of a path-complete
graph, which was inspired by concepts in automata theory. We showed that
every path-complete graph gives rise to a technique for the approximation of the
JSR. This provided a unifying framework that includes many of the previously
proposed techniques and also introduces new ones. (In fact, all families of LMIs
that we are aware of are particular cases of our method.) We shall also emphasize
that although we focused on switched linear systems because of our interest in
the JSR, the analysis technique of multiple Lyapunov functions on path-complete
graphs is clearly valid for switched nonlinear systems as well.

We compared the quality of the bound obtained from certain classes of
path-complete graphs, including all path-complete graphs with two nodes on an
alphabet of two matrices, and also a certain family of dual path-complete graphs.
We proposed a specific class of such graphs that appears to work particularly well
in practice and proved that the bound obtained from these graphs is invariant
under transposition of the matrices and is always within a multiplicative factor of
$1/\sqrt[4]{n}$ from the true JSR. Finally, we presented two converse Lyapunov theorems,
one for the well-known methods of minimum and maximum-of-quadratics
Lyapunov functions, and the other for a new class of methods that propose the use
of a common quadratic Lyapunov function for a set of words of possibly different
lengths.

We believe the methodology proposed in this chapter should straightforwardly 
extend to the case of constrained switching by requiring the graphs to have a 
path not for all the words, but only the words allowed by the constraints on the 
switching. A rigorous treatment of this idea is left for future work. 

Vincent Blondel showed that when the underlying automaton is not deter- 
ministic, checking path-completeness of a labeled directed graph is an NP-hard 
problem (personal communication). In general, the problem of deciding whether 
a non-deterministic finite automaton accepts all finite words is known to be 
PSPACE-complete [61, p. 265]. However, we are yet to investigate whether the 
same is true for automata arising from path-complete graphs which have a little 
more structure. At the moment, the NP-hardness proof of Blondel remains as the 
strongest negative result we have on this problem. Of course, the step of checking 
path-completeness of a graph is done offline and prior to the run of our algo- 
rithms for approximating the JSR. Therefore, while checking path-completeness 
is in general difficult, the approximation algorithms that we presented indeed run 
in polynomial time since they work with a fixed (a priori chosen) path-complete 
graph. Nevertheless, the question on complexity of checking path-completeness 
is interesting in many other settings, e.g., when deciding whether a given set of 
Lyapunov inequalities imply stability of an arbitrary switched system. 
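
To make the offline check concrete, here is a minimal sketch of a path-completeness test for graphs whose edge labels have length one. It is not taken from the thesis: it assumes the definition that a graph is path-complete when every finite word over the alphabet labels some directed path, and it uses the standard subset construction from automata theory, treating every node as a possible start. The function name and the edge encoding are hypothetical. The number of subsets explored can grow exponentially with the number of nodes, in line with the hardness results mentioned above.

```python
from collections import deque

def is_path_complete(nodes, edges, alphabet):
    """Decide path-completeness for edge labels of length one.

    edges is an iterable of (source, target, label) triples. A word has a
    path in the graph iff the set of nodes reachable after reading it,
    starting from all nodes, is nonempty.
    """
    step = {}
    for src, dst, label in edges:
        step.setdefault((src, label), set()).add(dst)

    start = frozenset(nodes)
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for label in alphabet:
            nxt = frozenset(t for s in current for t in step.get((s, label), ()))
            if not nxt:
                return False  # some word cannot be read along any path
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True
```

As a usage example, encoding the graph $G_1$ for $m = 2$ as the triples `(i, j, i)` for all `i, j` in `{1, 2}` (an edge from node `i` to node `j` carrying label `i`, an encoding chosen here for illustration) yields a path-complete graph, whereas removing enough edges so that some node has no outgoing edge for a given label can break the property.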

Some other interesting questions that can be explored in the future are the 
following. What are some other classes of path-complete graphs that lead to new 
techniques for proving stability of switched systems? How can we compare the 
performance of different path-complete graphs in a systematic way? Given a set 
of matrices, a class of Lyapunov functions, and a fixed size for the graph, can 
we efficiently come up with the least conservative topology of a path-complete
graph? Within the framework that we proposed, do all the Lyapunov inequalities
that prove stability come from path-complete graphs? What are the analogues of 
the results of this chapter for continuous time switched systems? To what extent 
do the results carry over to the synthesis (controller design) problem for switched 
systems? These questions and several others show potential for much follow-up 
work on path-complete graph Lyapunov functions.